Professor Artin knows that most of us were in 18.701, but for those of us that weren't, he likes to be called Mike.


We'll start the semester with group representations. 


Definition 1 


A matrix representation of a group G is a homomorphism 


$$R : G \to GL_n,$$

where $G$ is a finite group and $GL_n = GL_n(\mathbb{C})$.

In other words, the map $R$ takes a group element and returns a matrix. We'll denote the matrix corresponding to $g$ as $R_g$, and because $R$ is a homomorphism, we know that $R_g R_h = R_{gh}$ for all $g, h \in G$.


Example 2 


What are some matrix representations for $G = S_3$?


1. We can start with the permutation representation, where we just write down the permutation matrix that corresponds to our group element. $S_3$ is generated by $x = (123)$ and $y = (23)$, so it suffices to write down $P_x$ and $P_y$. Remember that we permute the columns, not the entries, so the defining elements are

$$P_x = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad P_y = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$

2. Alternatively, we can use the sign representation (which takes even permutations to $1$ and odd permutations to $-1$):

$$\Sigma_x = (1), \qquad \Sigma_y = (-1).$$

This shows that we can use 1 by 1 matrices as representations: we just treat them like complex numbers.


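3. Finally, there is a standard representation using the fact that $S_3 = D_3$ is the dihedral group for a triangle. Then (defining $c = \cos\frac{2\pi}{3} = -\frac{1}{2}$, $s = \sin\frac{2\pi}{3} = \frac{\sqrt{3}}{2}$) we have

$$A_x = \begin{pmatrix} c & -s \\ s & c \end{pmatrix}, \qquad A_y = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$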
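As a sanity check, the three representations can be tested against the defining relations $x^3 = 1$, $y^2 = 1$, $yx = x^2y$ of $S_3$. This is a minimal sketch: the helper names are made up for illustration, and the standard representation is assumed to be rotation by $2\pi/3$ together with reflection across the first axis, which is one possible choice of basis.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison, tolerant of floating-point error."""
    return all(abs(p - q) < tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

c, s = math.cos(2 * math.pi / 3), math.sin(2 * math.pi / 3)

# Images of the generators x = (123), y = (23) under each representation.
reps = {
    "permutation": ([[0, 0, 1], [1, 0, 0], [0, 1, 0]],
                    [[1, 0, 0], [0, 0, 1], [0, 1, 0]]),
    "sign": ([[1]], [[-1]]),
    # Assumed basis: rotation by 2*pi/3 and reflection across the first axis.
    "standard": ([[c, -s], [s, c]], [[1, 0], [0, -1]]),
}

def satisfies_relations(X, Y):
    """Check x^3 = 1, y^2 = 1 and yx = x^2 y for the matrices X, Y."""
    n = len(X)
    X2 = matmul(X, X)
    return (close(matmul(X2, X), identity(n))
            and close(matmul(Y, Y), identity(n))
            and close(matmul(Y, X), matmul(X2, Y)))

results = {name: satisfies_relations(X, Y) for name, (X, Y) in reps.items()}
print(results)
```

Since the relations hold for the images of the generators, each assignment extends to a homomorphism from $S_3$.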


There are three measures of importance for mathematical concepts:
- Utility,
- Beauty,
- ...the last one was forgotten.


But group representations are both useful and beautiful. And they're useful in other fields as well: 


Example 3 


Benzene is basically a hexagon at some initial time $t = 0$, but the atoms making up the molecule are moving around with some velocity.

In principle, we can figure out what happens to the molecule, but it's easier to figure out vibrational modes, and representation theory helps with this!


Fact 4 


Professor Artin wanted to be a chemist, but then he changed his mind. “Don't read the chemistry textbook.” 


Under the permutation representation $P$, there's a fixed vector

$$v = (1, 1, 1)^T.$$

Let's suppose we change basis to some $(v_1, v_2, v_3)$, where $v_1 = v$. Then the new representation is $P'$, and each $P'_g$ can be written as $Q^{-1} P_g Q$ for some fixed change-of-basis matrix $Q$. Specifically, this matrix $Q$ has column vectors $v_1, v_2, v_3$.

But now $P'_g$ is of the form

$$\begin{pmatrix} 1 & * & * \\ 0 & A & B \\ 0 & C & D \end{pmatrix},$$

where the asterisks are "junk."


Proposition 5 (Maschke’s theorem, version 1) 


We can eliminate the junk by choosing v2 and v3 carefully. In other words, if we have a vector fixed by all elements 


in our representation, we can change basis so that the first row and column are all 0 except for a 1 on the diagonal. 


Specifically, if $v_1$ is sent to itself, the orthogonal space to the span of $v_1$ is sent to itself. We're lucky here because the matrices $P_x$ and $P_y$ in the representation are already orthogonal! So now if we just choose our other basis vectors such that

$$v_2, v_3 \in v_1^{\perp},$$

we automatically get that

$$P'_g = \begin{pmatrix} 1 & 0 & 0 \\ 0 & A & B \\ 0 & C & D \end{pmatrix},$$

and $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$ will be another representation. Spoiler alert: it'll be the standard representation in some basis. So this allows us to say that $P$ is isomorphic to $T \oplus A$, where $T$ is the trivial representation (since we had a 1 in the top left corner of $P'_g$).
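The change of basis can be carried out explicitly. The sketch below assumes the particular basis $v_1 = (1,1,1)$, $v_2 = (1,-1,0)$, $v_3 = (1,1,-2)$ (any basis of $v_1^\perp$ would do for $v_2, v_3$) and uses exact rational arithmetic; `inverse` is a small helper written for this illustration.

```python
from fractions import Fraction

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Columns of Q are v1 = (1,1,1), v2 = (1,-1,0), v3 = (1,1,-2); v2, v3 span v1-perp.
Q = [[1, 1, 1],
     [1, -1, 1],
     [1, 0, -2]]
P = {
    "x": [[0, 0, 1], [1, 0, 0], [0, 1, 0]],
    "y": [[1, 0, 0], [0, 0, 1], [0, 1, 0]],
}

Q_inv = inverse(Q)
P_new = {g: matmul(matmul(Q_inv, M), Q) for g, M in P.items()}

for g, M in P_new.items():
    print(g, [[str(entry) for entry in row] for row in M])
```

In both new matrices the first row and first column are $(1, 0, 0)$, and the remaining $2 \times 2$ blocks have traces $-1$ and $0$.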

Generally, we try to work without a chosen basis. We will need one to explicitly write down a representation, but 
it’s like linear transformations: it’s easier to think of them without a basis at first. 

Next time, there will be less notation, but we do need to set everything up today. Let $V$ be a vector space, and let $GL(V)$ be the group of invertible linear operators on $V$. If the dimension of $V$ is $n$ and we have some basis $(v_1, \dots, v_n)$, then $GL(V)$ corresponds to $GL_n$ by sending an operator $\rho$ to its matrix $R$ with respect to that basis.


Definition 6 


Let $V$ be a vector space. A representation of $G$ on $V$ is a homomorphism $\rho : G \to GL(V)$.

Again, we write $g \mapsto \rho_g$, and we still have $\rho_g \rho_h = \rho_{gh}$ by the definition of a homomorphism.


Definition 7 


Let $W$ be a subspace of $V$. If $\rho_g W \subset W$ for all $g$, then $\rho_g W = W$ (since $\rho_g$ is invertible), so we call $W$ an invariant subspace of $V$.

Then if we choose a basis $(v_1, \dots, v_r, \dots, v_n)$ of $V$, where the first $r$ vectors form a basis of $W$ and the remaining ones span another subspace $U$, the matrix $R_g$ of $\rho_g$ will be

$$R_g = \begin{pmatrix} A_g & * \\ 0 & B_g \end{pmatrix},$$

where the top left matrix $A_g$ is $r$ by $r$, and $A$ is the restriction of $R$ to $W$. This corresponds to another representation $\alpha : G \to GL(W)$, and what Maschke's theorem says is that we can get rid of the $*$ junk again! This means we can say that $B$ is the restriction of $R$ to $U$, and if we denote by $\beta$ the homomorphism $G \to GL(U)$,

$$\rho = \alpha \oplus \beta.$$

This is called a direct sum, and we can write our representation in this way if there exist subspaces $W, U$, both invariant under $\rho$, with $V = W \oplus U$.


Definition 8 


An irreducible representation $\rho : G \to GL(V)$ is one where there does not exist a proper invariant subspace (an invariant subspace other than $0$ and $V$).

The sign representation is obviously irreducible. Why is the standard representation irreducible? Any proper subspace would have dimension 1, and if it were invariant, it would need to be spanned by a common eigenvector. And there are no vectors that are simultaneously eigenvectors for $A_x$ and $A_y$.


Definition 9 


The character $\chi$ of a representation $\rho$ is a function on $G$ such that

$$\chi(g) = \operatorname{tr}(\rho_g).$$

Let's make a character table: let $\chi_1$ be the character for the trivial representation, $\chi_2$ be the one for the sign representation, and $\chi_3$ be the one for the standard representation. We know what all the matrices look like, so filling out the table directly is pretty easy:

              $1$   $x$   $x^2$   $y$   $xy$   $x^2y$
$\chi_1$       1     1      1      1      1       1
$\chi_2$       1     1      1     -1     -1      -1
$\chi_3$       2    -1     -1      0      0       0

Here, we should recall that the trace of a matrix is invariant under conjugation. $x$ and $x^2$ are conjugate, and $y, xy, x^2y$ are conjugate, so that's why those columns look identical.


Three interesting facts, which turn out to not be coincidences: 


- The characters are constant on conjugacy classes.

- $|\chi|^2$ (that is, the sum of the squared absolute values of $\chi(g)$ over all group elements $g$) is 6 for each of the three characters, which is the size of the group.

- Treating each $\chi_i$ as a vector, all three are orthogonal to each other.


Theorem 10

Let $G$ be a finite group. Then the irreducible characters are orthogonal, and their squared lengths are all $|G|$.


Notice we didn't list the reducible representation $P$ in our character table. That's because the character of any direct sum is a linear combination of the irreducible characters with nonnegative integer coefficients!


Here's a better way to state Maschke's theorem: 


Theorem 11 (Maschke's theorem) 


Every representation of a finite group G is a direct sum of irreducible representations. 


With this, if we have any representation $\rho$ with character $\chi$, we can just use the projection formula to compute the decomposition $\chi = \sum_i a_i \chi_i$ by finding the inner product with each irreducible character (we'll go into more detail with this later). In our case, $\chi_P = \chi_1 + \chi_3$.


We can also consider representations as rotations of R?, but we're out of time. 
To answer a question posed in class: we're always assuming the vector spaces are complex. It's possible for us to have real vector spaces, but then a real representation can just be regarded as a complex one. (This is not a particularly important idea, aside from a few modifications to our definitions.)

To recap, if $V$ is a complex vector space, $GL(V)$ is the group of invertible linear operators on $V$, and $G$ is a finite group, then a representation of $G$ on $V$ is a homomorphism $\rho : G \to GL(V)$ sending $g \mapsto \rho_g$.


Proposition 12 


Representations of G on V correspond bijectively to linear operations of G on V. 


Let's describe the group operation: for a group element $g$, we send $v$ to $g * v$, where we need to follow the rules

$$g * (h * v) = (gh) * v, \qquad 1 * v = v.$$

Since the operation is linear, we also have

$$g * (v + v') = g * v + g * v', \qquad g * (cv) = c\,(g * v).$$

This is the analog of treating group actions as permutations from 18.701 (under a group action, every group element corresponds to an element of the permutation group), and now it makes sense to drop the $*$ from our notation. So $\rho_g(v)$ will be denoted $gv$.

Last time, we also defined an invariant subspace $W$ to be one where $\rho_g W \subset W$ for all $g \in G$. (Since $g$ has an inverse, this also means $\rho_g W = W$.) Now if $W$ is an invariant subspace, we can restrict $\rho$ to $W$ and get a representation on $W$. This motivates having a definition of an irreducible representation: it's one where there is no proper invariant subspace. We then found that if $W$ and $U$ are invariant, and $V = W \oplus U$, then $\rho$ is a direct sum of the restrictions to $W$ and $U$. Picking a basis

$$(v_1, v_2, \dots, v_k, v_{k+1}, \dots, v_n)$$

where the first $k$ elements form a basis of $W$ and the remainder form a basis of $U$, the matrix of $\rho_g$ (for each $g$) with respect to the basis is (in block form)

$$R_g = \begin{pmatrix} A_g & 0 \\ 0 & B_g \end{pmatrix}.$$

(And Maschke's theorem says that every representation is the direct sum of irreducible representations, which makes a lot of things nicer.)

Still doing review here: we defined a character $\chi$ of a representation $\rho$, which is a map from the group $G$ to the complex numbers $\mathbb{C}$, sending $g$ to the trace of $\rho_g$.


Example 13 


For any representation $\rho$, we have $\chi(1) = \dim V$, because $\rho_1$ is the identity matrix.

Notation-wise, we will also call this $\dim \rho$ and $\dim \chi$.

Fact 14

$\chi$ is constant under conjugation, since the trace is invariant under conjugation. (This is because $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, which means that $\chi(hgh^{-1}) = \operatorname{tr}(\rho_h \rho_g \rho_h^{-1}) = \operatorname{tr}(\rho_h^{-1} \rho_h \rho_g) = \chi(g)$.)

Remember that (from the characteristic polynomial) the trace of a matrix is the sum of the eigenvalues. So $\chi(g) = \chi(h)$ for all characters $\chi$ if $g$ and $h$ are in the same conjugacy class.


Last time, we wrote down a character table for irreducible representations of $S_3$:

              $1$   $x$   $x^2$   $y$   $xy$   $x^2y$
$\chi_1$       1     1      1      1      1       1
$\chi_2$       1     1      1     -1     -1      -1
$\chi_3$       2    -1     -1      0      0       0

Notice that all rows have the same "length" as vectors, and that each length squared is the order of the group. Since all elements in a given conjugacy class look identical in this table, it's enough to just write down one column for each conjugacy class. There's a catch though: we know the row vectors in our table are also orthogonal, but this is less clear if we write the table with conjugacy classes instead of elements. Thus, it's customary to use the compact character table, where we pick only one element from each conjugacy class (and write a number above to indicate how many elements each class has):

              (1)   (2)   (3)
              $1$   $x$   $y$
$\chi_1$       1     1     1
$\chi_2$       1     1    -1
$\chi_3$       2    -1     0


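It's nice if our characters all have "length" 1, so we'll divide through by the order of the group. Specifically, let's define an inner product:

Definition 15
If $\chi, \chi'$ are characters of $G$, then let

$$\langle \chi, \chi' \rangle = \frac{1}{|G|} \sum_{g \in G} \overline{\chi(g)}\, \chi'(g).$$

Notice the similarities with the standard Hermitian form on $\mathbb{C}^n$:

$$\langle X, Y \rangle = X^* Y = \sum_i \overline{x_i}\, y_i,$$

where we have the properties $\langle Y, X \rangle = \overline{\langle X, Y \rangle}$ and $\langle X, X \rangle = |X|^2$.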


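So now we can compute those inner products for the irreducible representations of $S_3$: for example,

$$\langle \chi_2, \chi_3 \rangle = \frac{1}{6}\left(1 \cdot \overline{\chi_2(1)}\chi_3(1) + 2 \cdot \overline{\chi_2(x)}\chi_3(x) + 3 \cdot \overline{\chi_2(y)}\chi_3(y)\right) = \frac{1}{6}(2 - 2 + 0) = 0,$$

as expected. We also find that $\langle \chi_1, \chi_1 \rangle = 1$ and so on, and now let's restate this as a general theorem: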


Theorem 16 (Main Theorem) 


Let $\chi_1, \chi_2, \dots, \chi_k$ be the irreducible characters of a finite group $G$.

1. These characters are orthonormal under the defined inner product: $\langle \chi_i, \chi_j \rangle$ is 1 if $i = j$ and 0 otherwise.

2. The number of irreducible characters is the number of conjugacy classes of $G$.

3. If $d_1, d_2, \dots$ are the dimensions of the $\chi_i$, then $\sum_i d_i^2 = |G|$.

4. Finally, if $\rho, \rho'$ are representations of $G$ with characters $\chi, \chi'$, then $\rho$ is isomorphic to $\rho'$ if and only if $\chi = \chi'$.
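Points (1) through (3) can be checked numerically for $S_3$ from the compact character table. This is only a sketch for this one group: the values are real integers, so the complex conjugate in the inner product is omitted, and the names `class_sizes`, `chars`, and `inner` are illustrative.

```python
from fractions import Fraction

class_sizes = [1, 2, 3]            # conjugacy classes of 1, x, y in S3
chars = {
    "chi1": [1, 1, 1],
    "chi2": [1, 1, -1],
    "chi3": [2, -1, 0],
}

def inner(chi, psi, sizes):
    """Class-weighted inner product (1/|G|) * sum over classes of size * chi * psi.

    The characters here are real, so no conjugation is needed.
    """
    order = sum(sizes)
    return Fraction(sum(n * a * b for n, a, b in zip(sizes, chi, psi)), order)

gram = {(a, b): inner(chars[a], chars[b], class_sizes) for a in chars for b in chars}
dims = [chi[0] for chi in chars.values()]    # chi(1) is the dimension

print(gram)
print(len(chars) == len(class_sizes), sum(d * d for d in dims) == sum(class_sizes))
```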


Elaborating on point (2): since the characters are orthogonal, there can be at most as many irreducible characters as conjugacy classes. The Main Theorem tells us that this is an equality! And now we can compute the character table without needing to find the representations explicitly. One of the characters must be trivial:

              (1)   (2)   (3)
              $1$   $x$   $y$
$\chi_1$       1     1     1
$\chi_2$
$\chi_3$

Now, we have to have the sum of squares of entries in the first column be 6: there's only one way to do this.

              (1)   (2)   (3)
              $1$   $x$   $y$
$\chi_1$       1     1     1
$\chi_2$       1
$\chi_3$       2


But for the two-dimensional representation, $x$ is a 3-cycle, so if $\lambda$ and $\lambda'$ are the eigenvalues of $\rho_x$, then $\lambda^3 = \lambda'^3 = 1$. Letting $\omega$ be a primitive cube root of unity, the other one is $\omega^2 = \overline{\omega} = \omega^{-1}$.

The important thing here is that $\chi(x^2) = \chi(x)$, since $x$ and $x^2$ are conjugate. Since $\chi$ is the sum of the eigenvalues and $\rho_{x^2}$ has eigenvalues that are the squares of those of $\rho_x$, we either have $\chi_3(x) = 1 + 1$ or $\omega + \overline{\omega}$. The former is too big: since $\langle \chi_3, \chi_3 \rangle = 1$, we can't have $2 \cdot 2 + 2 \cdot 2 \cdot 2 = 12$ in the sum for the inner product. Thus $\chi_3(x) = \omega + \overline{\omega} = -1$, and for $\langle \chi_3, \chi_3 \rangle = 1$, we must have $\chi_3(y) = 0$. (Alternatively, use the fact that $\chi_1$ and $\chi_3$ are orthogonal.)

              (1)   (2)   (3)
              $1$   $x$   $y$
$\chi_1$       1     1     1
$\chi_2$       1
$\chi_3$       2    -1     0


Filling out the rest is pretty routine. We're already one hour behind the syllabus because we don't have Maschke’s 


theorem proved, but this is a cool method for finding character tables! 
Let's do a quick review: we define a character $\chi$ of a representation $\rho : G \to GL(V)$ via $\chi(g) = \operatorname{tr}(\rho_g)$. Then we defined a Hermitian form

$$\langle \chi, \chi' \rangle = \frac{1}{|G|} \sum_g \overline{\chi(g)}\, \chi'(g).$$

Last time, we stated Theorem 16, the Main Theorem: if $\chi_1, \dots, \chi_k$ are the irreducible characters for a group $G$, then they are orthonormal under our Hermitian form. We also have some other computational results: the number of irreducible characters is the number of conjugacy classes, and if we let $d_i$ be the dimension of $\chi_i$ (the dimension of the matrices for the corresponding representation), then $\sum_i d_i^2 = |G|$. Finally, two representations with the same character are isomorphic: in other words, if we pick the right bases, the matrices for the two representations will be the same.

We also stated Theorem 11, Maschke's theorem: every representation is isomorphic to a direct sum of irreducible representations, so every character is a sum of irreducible characters. Since our irreducible characters are orthonormal, we can actually decompose using the projection formula:

$$\chi = \sum_i \langle \chi_i, \chi \rangle\, \chi_i.$$


Let’s go back to our example: 


Example 17 


Take $G = S_3$, with $x^3 = y^2 = 1$, $yx = x^2y$. Recall that our character table here is

              (1)   (2)   (3)
              $1$   $x$   $y$
$\chi_1$       1     1     1
$\chi_2$       1     1    -1
$\chi_3$       2    -1     0


To make some progress on proving these results, we'll need to define a few special representations. Recall that there's a permutation representation which we described at the beginning of class:

Definition 18
Suppose $G$ operates on a finite set $S = \{s_1, \dots, s_n\}$. (This means each group element $g \in G$ permutes the elements of $S$.) The permutation representation $R : G \to GL_n$ is defined such that $R_g$ is the permutation matrix associated with the permutation of elements of $S$.

For example, $S_3$ acts on the set $\{1, 2, 3\}$, so $x = (123)$ can be written as

$$\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},$$

and $y = (23)$ can be written as

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$

Then the character of $R$, denoted $\chi_R(g)$, just tells us the number of elements fixed by $g$ (since those give the 1s on the diagonal).
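The fixed-point description of the permutation character is easy to confirm by brute force. In this sketch permutations of $\{0, 1, 2\}$ are stored as tuples `p` with `p[j]` the image of `j`, and the matrix convention (basis vector $e_j$ goes to $e_{p(j)}$) is an assumption chosen to match the matrices above.

```python
from itertools import permutations

def perm_matrix(p):
    """Matrix sending basis vector e_j to e_{p[j]} (0-indexed)."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def fixed_points(p):
    """Number of points j with p[j] == j."""
    return sum(1 for j, image in enumerate(p) if image == j)

S3 = list(permutations(range(3)))
character = {p: trace(perm_matrix(p)) for p in S3}

for p in S3:
    print(p, character[p], fixed_points(p))
```

Here `(1, 2, 0)` plays the role of $x = (123)$ and `(0, 2, 1)` the role of $y = (23)$.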


Definition 19 


Let G operate on itself using left multiplication: g sends h to gh. Then the regular representation is the 


permutation representation associated with this operation. 


For example, the dimension of the regular representation for $S_3$ is 6, because we are permuting the 6 elements of the group $G$. But notice that any element other than the identity fixes nothing in our group, so the trace will always be 0 in those cases. This means that the character $\chi_{reg}$ takes on a simple form:

                  (1)   (2)   (3)
                  $1$   $x$   $y$
$\chi_1$           1     1     1
$\chi_2$           1     1    -1
$\chi_3$           2    -1     0
$\chi_{reg}$       6     0     0

Decomposing this character by the projection formula, notice that most of the terms in our sum disappear:

$$\langle \chi_{reg}, \chi_i \rangle = \frac{1}{6} \sum_g \overline{\chi_{reg}(g)}\, \chi_i(g) = \frac{1}{6}\left[6 \cdot \chi_i(1) + 0\right] = \chi_i(1) = d_i.$$

And this will be true in general as well: 


Proposition 20 


The character of the regular representation of a group $G$ satisfies

$$\chi_{reg} = \sum_i d_i \chi_i.$$

The above equation tells us that for $g \neq 1$,

$$0 = \sum_i \chi_i(g)\, d_i,$$

but more importantly, we can immediately prove point (3) of Theorem 16! If we plug in $g = 1$, we find that

$$|G| = \sum_i \chi_i(1)\, d_i = \sum_i d_i^2,$$

as desired.
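To see the regular character concretely, one can let $S_3$ act on itself by left multiplication and count fixed points. The sketch below models $S_3$ as the permutations of $\{0, 1, 2\}$; `compose` and `regular_character` are illustrative names, and the dimensions 1, 1, 2 are taken from the character table.

```python
from itertools import permutations

G = list(permutations(range(3)))          # the six elements of S3
identity = tuple(range(3))

def compose(g, h):
    """Return the permutation g∘h (apply h first, then g)."""
    return tuple(g[h[i]] for i in range(len(h)))

def regular_character(g):
    """Trace of left multiplication by g: the number of h in G with gh = h."""
    return sum(1 for h in G if compose(g, h) == h)

chi_reg = {g: regular_character(g) for g in G}
dims = [1, 1, 2]                          # dimensions of the irreducibles of S3

print(chi_reg)
print(sum(d * d for d in dims) == chi_reg[identity] == len(G))
```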
Next, let's consider the case where $G$ is an abelian group. Conjugation is trivial ($ghg^{-1} = h$ for all $g, h$), so all elements are in their own conjugacy class. So because the number of conjugacy classes is $|G|$, and that's also equal to the number of irreducible representations by point (2) of Theorem 16, we have

$$|G| = \sum_{i=1}^{|G|} d_i^2.$$

Therefore, all $d_i$ must be equal to 1: all of our irreducible representations are just homomorphisms from $G$ to $\mathbb{C}^\times$.


Example 21 


Let's write down the character table for $G = C_3 = \langle x \rangle$.

Then the character table looks like

              (1)   (1)   (1)
              $1$   $x$   $x^2$
$\chi_1$       1     1     1
$\chi_2$       1
$\chi_3$       1

To fill out the rest of the table, each entry is just the trace of a 1 by 1 matrix, which is the single entry in that matrix. Since $x^3 = 1$, all of these entries must be cube roots of 1. And there are exactly three cube roots of 1, so the rest is pretty easy (writing $\omega$ for a primitive cube root of 1):

              (1)    (1)          (1)
              $1$    $x$          $x^2$
$\chi_1$       1      1            1
$\chi_2$       1    $\omega$     $\omega^2$
$\chi_3$       1    $\omega^2$   $\omega$

Using this idea, we can also describe one-dimensional representations of any finite group. Let $G$ be arbitrary: now we have a homomorphism $\rho : G \to GL_1 = \mathbb{C}^\times$. The nonzero complex numbers are abelian under multiplication, so we're forcing abelian relations on our group now.

Let $\overline{G}$ be the abelianization of $G$, which is the quotient of $G$ by the subgroup generated by all commutators in $G$. Then one-dimensional representations of $G$ correspond to one-dimensional representations of the abelian group $\overline{G}$.


Example 22 


Let's use this to find the one-dimensional representations of $G = S_3$ again.

In $\overline{G}$, we have $\bar{x}\bar{y} = \bar{y}\bar{x} = \bar{x}^2\bar{y}$, so $\bar{x} = 1$. This means $\overline{G}$ is the cyclic group with two elements, and this has two one-dimensional representations. So $S_3$ also has two 1-dimensional representations: indeed, these come from the first and second rows of the character table.


Example 23 


It's time to compute a new character table: let $G = T$ be the group of rotational symmetries of a tetrahedron.

Remember that rotations are conjugate if they have the same angle. Letting $x$ be a rotation of $\frac{2\pi}{3}$ around a vertex, and letting $z$ be a rotation of $\pi$ around an edge, we have our conjugacy classes:

              (1)   (4)   (4)     (3)
              $1$   $x$   $x^2$   $z$

According to the main theorem, we'll have four irreducible characters. We need the sum of the squares of the dimensions to satisfy $\sum_i d_i^2 = |G| = 12$, so we need to add the three remaining squares to get 11. There's only one way to do that, so we can fill out some more of the character table:

              (1)   (4)   (4)     (3)
              $1$   $x$   $x^2$   $z$
$\chi_1$       1     1     1       1
$\chi_2$       1
$\chi_3$       1
$\chi_4$       3


Remember that $\overline{G}$, the abelianization, will tell us information about the one-dimensional representations. There are 3 one-dimensional representations, so $\overline{G}$ is the cyclic group of order 3. That means we're quotienting by a normal subgroup of order 4 to get the abelianization, and the only way to do that is to have the conjugacy class of $z$ combine with the identity (so that each of the three elements of $\overline{G}$ corresponds to 4 elements of $T$). Thus, $\bar{z} = 1$ in the abelianization, and thus $\chi_2(z) = \chi_3(z) = 1$.

              (1)   (4)   (4)     (3)
              $1$   $x$   $x^2$   $z$
$\chi_1$       1     1     1       1
$\chi_2$       1                   1
$\chi_3$       1                   1
$\chi_4$       3

Now we can fill in the rest of $\chi_2$ and $\chi_3$ using the $C_3$ character table, and we're almost done:

              (1)    (4)          (4)          (3)
              $1$    $x$          $x^2$        $z$
$\chi_1$       1      1            1            1
$\chi_2$       1    $\omega$     $\omega^2$     1
$\chi_3$       1    $\omega^2$   $\omega$       1
$\chi_4$       3


The bottom right corner, $\chi_4(z)$, is the sum of 3 eigenvalues for a matrix $\rho_4(z)$. But $z^2 = 1$, so all eigenvalues are $\pm 1$. Adding up three such eigenvalues, we know that $\chi_4(z)$ is either $3, 1, -1$, or $-3$, but since the length of $\chi_4$ (as a vector indexed by group elements) should be $\sqrt{12}$, we can't have $3$ or $-3$. In fact, $3^2 \cdot 1 + (\pm 1)^2 \cdot 3 = 12$ exactly, and that means the other entries for $x$ and $x^2$ must be 0. And this gives us the whole table:

              (1)    (4)          (4)          (3)
              $1$    $x$          $x^2$        $z$
$\chi_1$       1      1            1            1
$\chi_2$       1    $\omega$     $\omega^2$     1
$\chi_3$       1    $\omega^2$   $\omega$       1
$\chi_4$       3      0            0           -1

where we fill in the last $-1$ using orthogonality of $\chi_4$ with $\chi_1$.
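The finished table for $T$ can be checked against the Main Theorem numerically. This sketch uses floating-point complex numbers with $\omega = e^{2\pi i/3}$ (either primitive cube root works), so equalities are tested up to a small tolerance; the variable names are illustrative.

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)      # a primitive cube root of unity
sizes = [1, 4, 4, 3]                  # classes of 1, x, x^2, z
table = [
    [1, 1, 1, 1],
    [1, w, w ** 2, 1],
    [1, w ** 2, w, 1],
    [3, 0, 0, -1],
]

def inner(chi, psi):
    """Hermitian inner product, conjugating the first argument."""
    total = sum(n * complex(a).conjugate() * b for n, a, b in zip(sizes, chi, psi))
    return total / sum(sizes)

gram = [[inner(a, b) for b in table] for a in table]
for row in gram:
    print([round(abs(entry), 10) for entry in row])
```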


Fact 24 


We could have also guessed that $\chi_4$ corresponds to the standard representation: rotation by $\theta$ in $\mathbb{R}^3$ has trace $1 + 2\cos\theta$, and $1, x, z$ correspond to $\theta = 0, \frac{2\pi}{3}$, or $\pi$.
It's true that the dimension of an irreducible representation divides the order of the group. This is hard to prove, but we're allowed to use it for our problem set.
Let's warm up by doing a character table for the symmetric group $S_4$. The idea with $S_4$ is that permutations of the same cycle type are in the same conjugacy class: we have
* 1 identity,
* 3 products of two disjoint transpositions,
* 8 3-cycles,
* 6 transpositions,
* 6 4-cycles.
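The class sizes can be recovered by enumerating all 24 permutations and grouping them by cycle type. This is a brute-force sketch; `cycle_type` is a helper written for the illustration, with permutations stored as tuples on $\{0, 1, 2, 3\}$.

```python
from collections import Counter
from itertools import permutations

def cycle_type(p):
    """Return the cycle lengths of a permutation (given as a tuple), largest first."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

class_sizes = Counter(cycle_type(p) for p in permutations(range(4)))
for kind, count in sorted(class_sizes.items()):
    print(kind, count)
```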


With this, we can start constructing our table: 


              (1)    (3)         (8)     (6)    (6)
              $1$   (12)(34)    (123)   (12)   (1234)
$\chi_1$       1      1           1       1      1
$\chi_2$       1      1           1      -1     -1
$\chi_3$
$\chi_4$
$\chi_5$

The first two characters come from the trivial and sign homomorphisms, and now the only way to have the sum of the squares of the dimensions add up to 24 is the following:

              (1)    (3)         (8)     (6)    (6)
              $1$   (12)(34)    (123)   (12)   (1234)
$\chi_1$       1      1           1       1      1
$\chi_2$       1      1           1      -1     -1
$\chi_3$       2
$\chi_4$       3
$\chi_5$       3


Consider the conjugacy class of $(123)$. Since $(123)^3 = 1$, the eigenvalues of any matrix representing it are $1$, $\omega$, or $\overline{\omega}$. Since $(123)^2$ is also in the same conjugacy class, we must have $\chi_3((123)) = 1 + 1$ or $\omega + \overline{\omega}$. The former is too large for the length of $\chi_3$ to be 1, so we must have $\chi_3((123)) = \omega + \overline{\omega} = -1$.

              (1)    (3)         (8)     (6)    (6)
              $1$   (12)(34)    (123)   (12)   (1234)
$\chi_1$       1      1           1       1      1
$\chi_2$       1      1           1      -1     -1
$\chi_3$       2                 -1
$\chi_4$       3
$\chi_5$       3

Now notice that $\chi_1$ and $\chi_2$ only differ in the last two columns, so orthogonality of $\chi_3$ to both requires $\chi_3((12)) + \chi_3((1234)) = 0$. But $\chi_3((12))$ is either $2, 0$, or $-2$ by an eigenvalue argument, and $\pm 2$ are too large. Thus, the last two columns of $\chi_3$ must be 0, and then we finish $\chi_3$ by checking orthogonality against $\chi_1$:

              (1)    (3)         (8)     (6)    (6)
              $1$   (12)(34)    (123)   (12)   (1234)
$\chi_1$       1      1           1       1      1
$\chi_2$       1      1           1      -1     -1
$\chi_3$       2      2          -1       0      0
$\chi_4$       3
$\chi_5$       3


Similarly, for $\chi_4((123))$ and $\chi_5((123))$, we must have $1 + \omega + \overline{\omega} = 0$, and then $\chi_4((12))$ is either $3, 1, -1$, or $-3$ by arguments like those above. $\pm 3$ is too big, so now we have a bit more of our table:

              (1)    (3)         (8)     (6)      (6)
              $1$   (12)(34)    (123)   (12)     (1234)
$\chi_1$       1      1           1       1        1
$\chi_2$       1      1           1      -1       -1
$\chi_3$       2      2          -1       0        0
$\chi_4$       3                  0     $\pm 1$
$\chi_5$       3                  0     $\pm 1$

To make our job easier, we can calculate $\chi_{perm}$, the character of the permutation representation of $S_4$ on $\{1, 2, 3, 4\}$:

                  (1)    (3)         (8)     (6)      (6)
                  $1$   (12)(34)    (123)   (12)     (1234)
$\chi_1$           1      1           1       1        1
$\chi_2$           1      1           1      -1       -1
$\chi_3$           2      2          -1       0        0
$\chi_4$           3                  0     $\pm 1$
$\chi_5$           3                  0     $\pm 1$
$\chi_{perm}$      4      0           1       2        0


Since $\langle \chi_{perm}, \chi_{perm} \rangle = 2$, it must be the sum of two irreducible characters, one of which is the trivial character $\chi_1$. After some more work, we end up with our final character table:

                  (1)    (3)         (8)     (6)    (6)
                  $1$   (12)(34)    (123)   (12)   (1234)
$\chi_1$           1      1           1       1      1
$\chi_2$           1      1           1      -1     -1
$\chi_3$           2      2          -1       0      0
$\chi_4$           3     -1           0       1     -1
$\chi_5$           3     -1           0      -1      1
$\chi_{perm}$      4      0           1       2      0
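The final $S_4$ table can be double-checked, and $\chi_{perm}$ decomposed, using the class-weighted inner product. This is a sketch in exact arithmetic: all values are real integers, so no conjugation is needed, and the rows assume $\chi_4 = (3, -1, 0, 1, -1)$ and $\chi_5 = (3, -1, 0, -1, 1)$.

```python
from fractions import Fraction

sizes = [1, 3, 8, 6, 6]       # classes of 1, (12)(34), (123), (12), (1234)
irreducibles = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, -1, -1],
    [2, 2, -1, 0, 0],
    [3, -1, 0, 1, -1],
    [3, -1, 0, -1, 1],
]
chi_perm = [4, 0, 1, 2, 0]

def inner(chi, psi):
    """Class-weighted inner product; all values here are real integers."""
    return Fraction(sum(n * a * b for n, a, b in zip(sizes, chi, psi)), sum(sizes))

gram = [[inner(a, b) for b in irreducibles] for a in irreducibles]
multiplicities = [inner(chi, chi_perm) for chi in irreducibles]

print(gram)
print(multiplicities)          # coefficients of chi_perm in terms of chi_1..chi_5
```

The multiplicities come out as $1, 0, 0, 1, 0$, that is, $\chi_{perm} = \chi_1 + \chi_4$.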


Let's move on. Let $\rho$ be a representation of $G$ on a vector space $V$, and let's use the notation $\rho_g(v) = gv$. We'll list our group elements as

$$G = \{g_1 = 1, g_2, \dots, g_n\},$$

and we'll find a $G$-invariant subspace $W$ (such that $\rho_g W = W$ for all $g$) as follows:

Lemma 25

Let $v \in V$ be an arbitrary vector, and let $v_i = g_i v$ for all $1 \le i \le n$. Then $W$, the span of the vectors $v_1, \dots, v_n$, is invariant.

Proof. We want to show that if $h \in G$, then $hW \subset W$. It's enough to show that each of the spanning vectors $hv_i$ is in $W$. This is clear because

$$hv_i = hg_iv = (hg_i)v = g_jv = v_j$$

for some $j$, and this is an element of $W$ by definition.


Corollary 26

If $\rho$ is irreducible, then the dimension of $V$ is at most $|G| = n$. (After all, for any nonzero $v$ in a representation of dimension greater than $n$, the span $W$ above would be a proper invariant subspace.)

Unfortunately, this is useless, since we already know from the equation $\sum_i d_i^2 = |G|$ that the dimension is at most $\sqrt{n}$. But we can do something a bit less useless: let's find an invariant vector $\tilde{v}$, so $h\tilde{v} = \tilde{v}$ for all $h \in G$. We're going to "average over the group," and this is an important concept:

Proposition 27

Let $G$ be a finite group acting on a vector space $V$. Then for any $v \in V$, the vector

$$\tilde{v} = \frac{1}{|G|} \sum_{g \in G} gv$$

is invariant under $G$.

Proof. The key idea is that left multiplication by $h$ is a bijective map $G \to G$. For any $h \in G$,

$$h\tilde{v} = h\left(\frac{1}{|G|} \sum_g gv\right) = \frac{1}{|G|} \sum_g hgv,$$

and since we're summing over the group, and since the set of $hg$ runs over the whole group (just in a different order), this is identical to $\tilde{v}$, as desired.


For example, left multiplication by $y$ sends the elements of $S_3$ to a different permutation of themselves:

$$\{1, x, x^2, y, xy, x^2y\} \xrightarrow{\ y\ } \{y, x^2y, xy, 1, x^2, x\}.$$

And since a nontrivial irreducible representation has no nonzero invariant vectors (the span of one would be a one-dimensional invariant subspace), this fixed vector $\tilde{v}$ must be the zero vector for all irreducible representations except the trivial one.
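Averaging over the group is easy to try out in the permutation representation of $S_3$ on $\mathbb{C}^3$. In this sketch the starting vector $(1, 2, 6)$ is arbitrary, `act` implements the permutation action on coordinates, and exact fractions are used so the invariance check is exact.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))          # S3 acting on coordinates 0, 1, 2

def act(g, v):
    """Permutation action: the entry in position j moves to position g[j]."""
    out = [None] * len(v)
    for j, value in enumerate(v):
        out[g[j]] = value
    return out

def average(v):
    """Return (1/|G|) * sum of g.v over all g in G."""
    total = [Fraction(0)] * len(v)
    for g in G:
        total = [t + c for t, c in zip(total, act(g, v))]
    return [t / len(G) for t in total]

v = [1, 2, 6]
v_tilde = average(v)
print(v_tilde, all(act(h, v_tilde) == v_tilde for h in G))
```

The average lands on a multiple of $(1, 1, 1)$, the fixed vector of the permutation representation.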


With this, we finally have all the tools we need to prove Theorem 11, Maschke’s theorem: 


Proof. We want to show that if $W$ is a proper invariant subspace of $V$, then there is an invariant subspace $U$ such that $V = W \oplus U$. If this is true, then we can repeat the process on $W$ and $U$ by induction: choose a basis $(v_1, v_2, \dots, v_r, v_{r+1}, \dots, v_n)$ for $V$, where the first $r$ vectors form a basis for $W$ and the last $n - r$ vectors form a basis for $U$. Then $R_g$, the matrix of $\rho_g$, must look like

$$R_g = \begin{pmatrix} A_g & 0 \\ 0 & B_g \end{pmatrix}.$$

We can then inductively break down our matrix until each subspace is irreducible.
Tentatively, we want to pick $U = W^\perp$, because we know that $V = W \oplus W^\perp$.

Fact 28

Here, we use a lemma from 18.701: if each $\rho_g$ is unitary, $V$ is a Hermitian space, and $W$ is $G$-invariant, then $W^\perp$ is $G$-invariant and $V = W \oplus W^\perp$. (The specific fact we use is that if $W$ is $T$-invariant, then $W^\perp$ is $T^*$-invariant, and $T^* = T^{-1}$ for a unitary operator.)


There are two potential issues: we may not have a positive definite Hermitian form on $V$ (though we can always choose one), and even if we do have one, the operators $\rho_g$ may not be unitary. (Recall that an operator $T$ on a Hermitian space $V$ is unitary if $\langle v, w \rangle = \langle Tv, Tw \rangle$ for all $v, w$.) Well, we'll resolve both at the same time: we'll find a positive definite Hermitian form on $V$ so that

$$\langle v, w \rangle = \langle gv, gw \rangle \quad \forall g,$$

and then we'll be able to apply the above lemma. Start with some arbitrary positive definite Hermitian form $\{v, w\}$ (we can just pick the standard form in some basis), and we'll average over $G$: define

$$\langle v, w \rangle = \frac{1}{|G|} \sum_g \{gv, gw\}.$$

This is a Hermitian form, because each of the $\{gv, gw\}$ is linear in the second variable and satisfies Hermitian symmetry. It's also positive definite, because each $\{gv, gv\}$ is positive when $v \neq 0$. So now we just need to show that

$$\langle hv, hw \rangle = \frac{1}{|G|} \sum_g \{ghv, ghw\}$$

equals $\langle v, w \rangle$ for any $h \in G$. But since right multiplication in $G$ is a bijective map, this is the same sum as $\langle v, w \rangle$ in a different order, and we're done.

Thus we can find an invariant subspace $W^\perp$, and Maschke's theorem holds by induction!


This construction is called the “unitary trick.” Let's show this in action for more concreteness: 


Example 29 


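Start with the matrix $R = \begin{pmatrix} 1 & 1 \\ 0 & -1 \end{pmatrix}$. Notice that $R^2 = I$, so we get a matrix representation of the cyclic group $G = \{1, g\}$ by sending $g$ to $R$ and $1$ to the identity $I$.

Letting $\{X, Y\} = X^*Y$ be the standard Hermitian form, the "good form" we want to use is

$$\langle X, Y \rangle = \frac{1}{2}\left(X^*Y + (RX)^*RY\right) = \frac{1}{2} X^*(I + R^*R)Y = \frac{1}{2} X^* \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix} Y.$$

Now we verify that $R$ is indeed unitary with respect to this form: we have

$$\langle RX, RY \rangle = \frac{1}{2} X^* R^* \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix} R\, Y,$$

and indeed this expands out to the same expression as $\langle X, Y \rangle$.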
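The matrix identity behind this example can be checked directly. The sketch below assumes $R = \begin{pmatrix} 1 & 1 \\ 0 & -1 \end{pmatrix}$, a choice consistent with $R^2 = I$ and with the form matrix $\begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$; since $R$ is real, $R^*$ is just the transpose.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

R = [[1, 1], [0, -1]]          # assumed matrix with R^2 = I
I2 = [[1, 0], [0, 1]]

RtR = matmul(transpose(R), R)
# Twice the averaged form: I + R^T R (R is real, so R^* is the transpose).
M = [[I2[i][j] + RtR[i][j] for j in range(2)] for i in range(2)]

standard_form_invariant = (RtR == I2)
averaged_form_invariant = (matmul(matmul(transpose(R), M), R) == M)

print(M, standard_form_invariant, averaged_form_invariant)
```

So $R$ is not unitary for the standard form, but it does preserve the averaged one.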


5 February 15, 2019 


Let's warm up by doing a character table for a nonabelian group of order $|G| = 55$. Recall that by the Sylow theorems for $p = 11$, there is a normal subgroup of order 11, and therefore there are 10 elements of order 11. The Sylow 5-subgroups are not normal (or else our group would be abelian), so there are 11 Sylow 5-subgroups and therefore 44 elements of order 5. These, plus the identity, give all of the elements of our group.
Let $x$ be an element of order 11 and $y$ be an element of order 5. Because the subgroup generated by $x$ is normal, $yxy^{-1} = x^i$ for some $i$. But since $y$ has order 5, conjugating $x$ by $y$ five times tells us that we need $i^5 \equiv 1 \pmod{11}$, so let's take $i = 3$ (all valid choices of $i \neq 1$ give isomorphic groups).
Now the conjugacy class of $y$ has the elements $C(y) = \{y, xy, x^2y, \dots, x^{10}y\}$, since $x^{-1}yx = x^2y$. This gives us four conjugacy classes (one for each of the nontrivial powers of $y$). Then we also have $C(x) = \{x, x^3, x^9, x^{27}, x^{81}\}$ and $C(x^{-1})$ (which covers the other non-identity powers of $x$). Thus, we have the following character table skeleton (where $\chi_6$ and $\chi_7$ have dimension 5 from an exercise):

              (1)   (5)   (5)        (11)   (11)    (11)    (11)
              $1$   $x$   $x^{-1}$   $y$    $y^2$   $y^3$   $y^4$
$\chi_1$       1     1      1         1      1       1       1
$\chi_2$       1
$\chi_3$       1
$\chi_4$       1
$\chi_5$       1
$\chi_6$       5
$\chi_7$       5
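The conjugacy class structure can be confirmed by building the group explicitly. The sketch below models the group as pairs $(a, b)$ standing for $x^a y^b$, with the multiplication rule forced by $yxy^{-1} = x^3$; this encoding and the helper names are assumptions made for the illustration.

```python
def mul(g, h):
    """(x^a y^b)(x^c y^d) = x^(a + 3^b c) y^(b + d), using y x y^-1 = x^3."""
    (a, b), (c, d) = g, h
    return ((a + pow(3, b, 11) * c) % 11, (b + d) % 5)

G = [(a, b) for a in range(11) for b in range(5)]
e = (0, 0)
inverse = {g: next(h for h in G if mul(g, h) == e) for g in G}

def conj_class(g):
    """The set of all h g h^-1 for h in G."""
    return frozenset(mul(mul(h, g), inverse[h]) for h in G)

classes = {conj_class(g) for g in G}
sizes = sorted(len(c) for c in classes)

x, y = (1, 0), (0, 1)
print(sizes)
print(sorted(a for a, _ in conj_class(x)))    # exponents of x in C(x)
```

The class of $x$ consists of the powers $x^k$ with $k \in \{1, 3, 4, 5, 9\}$, which are $1, 3, 9, 27, 81$ reduced mod 11.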


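Let $\zeta$ be a primitive fifth root of unity. The 1-dimensional characters have to send $y$ to fifth roots of 1, so $\chi_i(y)$ must be a power of $\zeta$. Also, $yxy^{-1} = x^3$ implies that $x$ is sent to 1 in the abelianization, so $\chi_i(x)$ is 1 in each of the one-dimensional representations! This fills out a significant amount of the table, and now we can work on the next row:

              (1)   (5)   (5)        (11)        (11)        (11)        (11)
              $1$   $x$   $x^{-1}$   $y$         $y^2$       $y^3$       $y^4$
$\chi_1$       1     1      1         1           1           1           1
$\chi_2$       1     1      1       $\zeta$     $\zeta^2$   $\zeta^3$   $\zeta^4$
$\chi_3$       1     1      1       $\zeta^2$   $\zeta^4$   $\zeta$     $\zeta^3$
$\chi_4$       1     1      1       $\zeta^3$   $\zeta$     $\zeta^4$   $\zeta^2$
$\chi_5$       1     1      1       $\zeta^4$   $\zeta^3$   $\zeta^2$   $\zeta$
$\chi_6$       5    $u$    $v$       $a$         $b$         $c$         $d$
$\chi_7$       5

We must have orthogonality between $\chi_6$ and any of $\chi_1$ through $\chi_5$. But there is a common term of $5 + 5u + 5v$ in all 5 of those expressions, and now we have the five different equations

$$-5 - 5u - 5v = 11 \cdot \begin{cases} a + b + c + d \\ \zeta a + \zeta^2 b + \zeta^3 c + \zeta^4 d \\ \zeta^2 a + \zeta^4 b + \zeta^6 c + \zeta^8 d \\ \zeta^3 a + \zeta^6 b + \zeta^9 c + \zeta^{12} d \\ \zeta^4 a + \zeta^8 b + \zeta^{12} c + \zeta^{16} d \end{cases}$$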

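But if we add up all five of these equations, the right hand side is just 0 (for each of $a, b, c, d$, the coefficients are the five powers of a primitive fifth root of unity, which sum to 0). So that means $5 + 5u + 5v = 0$. In fact, this also tells us that $a, b, c, d$ are all zero, because the top four equations correspond to the equation

$$\begin{pmatrix} 1 & 1 & 1 & 1 \\ \zeta & \zeta^2 & \zeta^3 & \zeta^4 \\ \zeta^2 & \zeta^4 & \zeta^6 & \zeta^8 \\ \zeta^3 & \zeta^6 & \zeta^9 & \zeta^{12} \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix},$$

and this $4 \times 4$ matrix is actually invertible because the fifth roots of unity $\zeta, \zeta^2, \zeta^3, \zeta^4$ are distinct:

Theorem 30 (Vandermonde)

For any $n$,

$$\det \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ \vdots & \vdots & & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{pmatrix} = \prod_{i < j} (x_j - x_i).$$

Proof. For an $n$ by $n$ matrix, the degree of the polynomial determinant is $0 + 1 + 2 + \cdots + (n-1) = \binom{n}{2}$. But note that $(x_i - x_j)$ is always a factor, since $x_i = x_j$ makes the determinant 0 (two columns are identical). Multiplying all such factors together, we've gotten to the degree of the polynomial, so the determinant is a scalar multiple of the product; we'll omit showing that the constant factor is correct.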
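For a quick numerical check of the Vandermonde formula $\det\left(x_j^{\,i-1}\right) = \prod_{i<j}(x_j - x_i)$, including the constant factor, here is a small sketch with integer entries. The determinant is computed by the Leibniz expansion, which is only sensible for tiny matrices; the sample points are arbitrary.

```python
from itertools import permutations
from math import prod

def det(m):
    """Determinant by the Leibniz expansion; fine for the small matrices used here."""
    n = len(m)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        total += (-1) ** inversions * prod(m[i][p[i]] for i in range(n))
    return total

def vandermonde(xs):
    """Row i holds the i-th powers x_1^i, ..., x_n^i."""
    return [[x ** i for x in xs] for i in range(len(xs))]

xs = [2, 3, 5, 7]
lhs = det(vandermonde(xs))
rhs = prod(xs[j] - xs[i] for i in range(len(xs)) for j in range(i + 1, len(xs)))
print(lhs, rhs)
```

Repeating a point makes two columns equal, and the determinant drops to 0, which is exactly why distinct points give an invertible matrix.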


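So now our question is just to find the four missing entries below:

              (1)   (5)    (5)        (11)        (11)        (11)        (11)
              $1$   $x$    $x^{-1}$   $y$         $y^2$       $y^3$       $y^4$
$\chi_1$       1     1       1         1           1           1           1
$\chi_2$       1     1       1       $\zeta$     $\zeta^2$   $\zeta^3$   $\zeta^4$
$\chi_3$       1     1       1       $\zeta^2$   $\zeta^4$   $\zeta$     $\zeta^3$
$\chi_4$       1     1       1       $\zeta^3$   $\zeta$     $\zeta^4$   $\zeta^2$
$\chi_5$       1     1       1       $\zeta^4$   $\zeta^3$   $\zeta^2$   $\zeta$
$\chi_6$       5    $u$     $v$        0           0           0           0
$\chi_7$       5    $u'$    $v'$       0           0           0           0

$\chi_6$ is the character of a five-dimensional representation, and $\chi_6(x)$ is the sum of the eigenvalues of $\rho_6(x)$, so $u$ is a sum of five 11th roots of unity. Let $\eta$ be a primitive 11th root of unity: now $u$ is a sum of 5 powers of $\eta$, but if a conjugacy class contains $x$, it contains $x^3$. So if $\eta^i$ is one of the eigenvalues involved in $\chi_6(x)$, so are $\eta^{3i}, \eta^{9i}, \eta^{27i}, \eta^{81i}$. There are only two possibilities:

$$u = \eta + \eta^3 + \eta^9 + \eta^5 + \eta^4 \quad \text{or} \quad u = \eta^2 + \eta^6 + \eta^7 + \eta^{10} + \eta^8.$$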


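Picking them to be $u_1$ and $u_2$ arbitrarily and using orthonormality, we find that our completed character table looks like

              (1)   (5)     (5)        (11)        (11)        (11)        (11)
              $1$   $x$     $x^{-1}$   $y$         $y^2$       $y^3$       $y^4$
$\chi_1$       1     1        1         1           1           1           1
$\chi_2$       1     1        1       $\zeta$     $\zeta^2$   $\zeta^3$   $\zeta^4$
$\chi_3$       1     1        1       $\zeta^2$   $\zeta^4$   $\zeta$     $\zeta^3$
$\chi_4$       1     1        1       $\zeta^3$   $\zeta$     $\zeta^4$   $\zeta^2$
$\chi_5$       1     1        1       $\zeta^4$   $\zeta^3$   $\zeta^2$   $\zeta$
$\chi_6$       5    $u_1$    $u_2$      0           0           0           0
$\chi_7$       5    $u_2$    $u_1$      0           0           0           0

We can explicitly calculate $u_1$ and $u_2$, since they are the roots of a quadratic

$$x^2 - (u_1 + u_2)x + u_1u_2 = x^2 + x + (\text{garbage}).$$

And the garbage is 3, because expanding out $u_1u_2$ gives 25 terms: there are 5 ones and 2 sums of all the other roots, for a total of $5 + 2 \cdot (-1) = 3$.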
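The values of $u_1 + u_2$ and $u_1u_2$ can be confirmed numerically. This sketch takes $\eta = e^{2\pi i/11}$ (an assumption; any primitive 11th root gives the same sum and product) and splits the exponents $1, \dots, 10$ into the orbit of $1$ under multiplication by 3 and its complement.

```python
import cmath

eta = cmath.exp(2j * cmath.pi / 11)                 # a primitive 11th root of unity
orbit = sorted({pow(3, k, 11) for k in range(5)})   # exponents 3^k mod 11
others = [k for k in range(1, 11) if k not in orbit]

u1 = sum(eta ** k for k in orbit)
u2 = sum(eta ** k for k in others)

print(orbit, others)
print(u1 + u2, u1 * u2)          # approximately -1 and 3
```

So $u_1$ and $u_2$ are the two roots of $x^2 + x + 3$, namely $\frac{-1 \pm \sqrt{-11}}{2}$.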


Fact 31 


By the way, Vandermonde was a violinist — Professor Artin learned this on a wiki. 


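Time to move on to the actual topic of today. Suppose we have two representations of a group $G$: let them be

$$\alpha : G \to GL(U), \qquad \beta : G \to GL(V).$$

Then saying that $\alpha$ is isomorphic to $\beta$ means that if we choose the correct bases, the matrices $\alpha_g, \beta_g$ are the same for all $g$. But another way to say this is that there exists an isomorphism of vector spaces $T : V \to U$ which is compatible with the operation of $G$. Here's the commutative diagram:

$$\begin{array}{ccc} V & \xrightarrow{\ T\ } & U \\ \downarrow{\scriptstyle \beta_g} & & \downarrow{\scriptstyle \alpha_g} \\ V & \xrightarrow{\ T\ } & U \end{array}$$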

So we can now redefine the idea of isomorphism more formally. 


Definition 32 


$\alpha$ and $\beta$ are isomorphic representations if there exists an isomorphism $T : V \to U$ of vector spaces such that

$$\alpha_g T = T \beta_g$$

for all $g \in G$. We say that a linear transformation $T$ is $G$-invariant if $T\beta_g = \alpha_g T$ for all $g$.

Note here that $\alpha$ is a representation on $U$, and $\beta$ is a representation on $V$.


Lemma 33 


If $T : V \to U$ is invariant, then $\ker T$ is an invariant subspace of $V$ and $\operatorname{Im} T$ is an invariant subspace of $U$.


Proof. Let $K = \ker T$. If $v \in K$, then $T(v) = 0$, and our goal is to show that $\beta_g K \subset K$ for all $g$. Note that
\[
0 = \alpha_g T v = T \beta_g v
\]
since $T$ is invariant, and therefore $\beta_g(v) \in K$ for all $v \in K$.

On the other hand, we want to show that $\alpha_g(\operatorname{Im} T)$ is contained in $\operatorname{Im} T$ (which is a subspace of $U$). If $u \in \operatorname{Im} T$, then $u = Tv$ for some $v$. But then
\[
\alpha_g u = \alpha_g T v = T \beta_g v,
\]
so $\alpha_g u$ is $T$ of something and is therefore in the image as well.


Note that it's very hard for $T$ to be invariant: we need $T\beta_g = \alpha_g T$ to be satisfied for all $g$. Schur's lemma basically says there are almost no such operators:


Lemma 34 (Schur's lemma)


Suppose that $\alpha$ and $\beta$ are irreducible representations and that $T : V \to U$ is invariant. Then either $T = 0$ or $T$ is an isomorphism.


Proof. Since $\alpha$ and $\beta$ are irreducible, the only invariant subspaces are 0 and the whole vector space. So the kernel of $T$ is either 0 (in which case $T$ is injective), or $V$ (in which case $T = 0$). Similarly, the image is either $U$ (in which case $T$ is surjective), or 0 (in which case $T = 0$). So unless $T$ is the zero operator, it's both injective and surjective and is therefore an isomorphism.


Lemma 35 (Schur's lemma, part 2) 


Suppose $T : U \to U$ is invariant for an irreducible representation $\alpha$. Then $T = cI$ for some constant $c$.


Proof. Take an eigenvalue $\lambda$ of $T$ (one exists because we are working over $\mathbb{C}$). Then we can see that $S = T - \lambda I$ is invariant, and because $S$ has 0 as an eigenvalue, its kernel is nonzero. The kernel is an invariant subspace, so it must be the whole vector space $U$, and therefore $S = 0 \implies T = \lambda I$.


Lemma 36 


Let $T : V \to U$ be an arbitrary linear transformation and $\alpha, \beta$ be representations. Then the linear transformation
\[
\tilde{T} = \frac{1}{|G|} \sum_g \alpha_g^{-1} T \beta_g
\]
is invariant.


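Proof. We want to show that if $h \in G$, then
\[
\alpha_h^{-1} \tilde{T} \beta_h = \tilde{T},
\]
which is the same as $\tilde{T}\beta_h = \alpha_h\tilde{T}$. Since the sum is linear,
\[
\alpha_h^{-1} \tilde{T} \beta_h = \frac{1}{|G|} \sum_g \alpha_h^{-1} \alpha_g^{-1} T \beta_g \beta_h = \frac{1}{|G|} \sum_g \alpha_{gh}^{-1} T \beta_{gh},
\]
and we're summing over all group elements $g' = gh$, so this is just $\tilde{T}$ by definition.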
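
The averaging trick can be tried out on a small case. The sketch below assumes $G = S_3$ with $\alpha = \beta$ the permutation representation on $\mathbb{C}^3$ (so $A_g = B_g = P_g$) and an arbitrary matrix $T$ chosen only for illustration; helper names such as `perm_matrix` and `average` are hypothetical. It uses exact rational arithmetic so that invariance can be tested with `==`.

```python
from fractions import Fraction
from itertools import permutations

def perm_matrix(p):
    """Matrix sending the basis vector e_j to e_{p[j]}."""
    n = len(p)
    return [[Fraction(1 if p[j] == i else 0) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# Assumption: G = S_3 acting on C^3 by permutation matrices.
GROUP = [perm_matrix(p) for p in permutations(range(3))]

def average(T):
    """Return (1/|G|) * sum over g of P_g^{-1} T P_g."""
    total = [[Fraction(0)] * 3 for _ in range(3)]
    for P in GROUP:
        # For a permutation matrix, the inverse is the transpose.
        term = matmul(matmul(transpose(P), T), P)
        total = [[total[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return [[entry / len(GROUP) for entry in row] for row in total]

T = [[Fraction(v) for v in row] for row in [[1, 2, 3], [4, 5, 6], [7, 8, 10]]]
T_avg = average(T)

# Invariance: T_avg * P_h == P_h * T_avg for every h in G.
is_invariant = all(matmul(T_avg, P) == matmul(P, T_avg) for P in GROUP)
```

Here the averaged matrix has a constant diagonal and a constant off-diagonal, which is what commuting with every permutation matrix forces.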


But this $\tilde{T}$ has to be zero if $\alpha$ and $\beta$ are irreducible and not isomorphic, and we'll see next time how to use these to discuss orthogonality between characters.


6 February 19, 2019


We'll derive the orthogonality relations today. Let's start with a review: let $\alpha : G \to GL(U)$ and $\beta : G \to GL(V)$ be two representations, and let $T : V \to U$ be a linear transformation. Then we defined $T$ to be invariant if $T\beta_g = \alpha_g T$ for all $g$. We found that we can produce invariant linear transformations by averaging:
\[
\tilde{T} = \frac{1}{|G|} \sum_g \alpha_g^{-1} T \beta_g.
\]

This $\tilde{T}$ is invariant, since $\alpha_h^{-1}\tilde{T}\beta_h$ and $\tilde{T}$ are both adding over the whole group, just in a different order. We also learned that (by Schur's lemma) if $\alpha$ and $\beta$ are both irreducible and not isomorphic, then the only invariant linear transformation from $V$ to $U$ is zero. Also, if $\alpha$ is an irreducible representation and $T$ is an invariant linear operator from $U$ to itself, then $T = cI$ for some $c$.

Now we just want to extract the characters out of this. We'll use matrix notation, so we'll use specific complex vector spaces $U = \mathbb{C}^m$, $V = \mathbb{C}^n$, and we'll send $\alpha_g, \beta_g$ to matrices $A_g$ and $B_g$. Then $T : V \to U$ becomes an $m \times n$ matrix $M$.


Definition 37 


Let $\mathbb{C}^{m \times n}$ be the space of $m \times n$ matrices. Define a linear operator $\Phi$ on this space by
\[
\Phi(M) = \frac{1}{|G|} \sum_g A_g^{-1} M B_g.
\]


This is basically just our invariant $\tilde{T}$ with different notation. We want to find the trace of $\Phi$; we'll start with the function $F : \mathbb{C}^{m \times n} \to \mathbb{C}^{m \times n}$ which sends $M$ to $AMB$. Remember that we have an $m \times m$ matrix $A$ and an $n \times n$ matrix $B$: let $\lambda_1, \cdots, \lambda_m$ be the eigenvalues of $A$ and $\mu_1, \cdots, \mu_n$ be the eigenvalues of $B$ (which are the same as the eigenvalues of $B^T$). So now if $X_i, Y_j$ are eigenvectors of $A, B^T$ respectively, with eigenvalues $\lambda_i, \mu_j$, then $X_i Y_j^T$ (defining this to be $M_{ij}$) is an eigenvector of $F$ with eigenvalue $\lambda_i \mu_j$, since
\[
F(M_{ij}) = A X_i Y_j^T B = (A X_i)(Y_j^T B) = (A X_i)(B^T Y_j)^T = (\lambda_i X_i)(\mu_j Y_j^T) = \lambda_i \mu_j M_{ij}.
\]

But this gives $mn$ eigenvectors of the form $M_{ij}$, so that's all of them! Sure, the products $\lambda_i \mu_j$ might be equal for different $i$s and $j$s, but most of the time, they're distinct, so this works in general by continuity.


So now note that the trace of $F$ is the sum of the eigenvalues, which is
\[
\sum_{i,j} \lambda_i \mu_j = (\lambda_1 + \cdots + \lambda_m)(\mu_1 + \cdots + \mu_n) = \operatorname{tr} A \, \operatorname{tr} B.
\]
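
The trace formula can also be checked directly, without eigenvectors, by reading off the diagonal of $F$ in the basis of matrix units $E_{ij}$. This is only a sketch under that assumption; the function names and the two sample matrices are invented for illustration.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def trace_of_F(A, B):
    """Trace of the operator M -> A M B on m x n matrices.

    The diagonal entry of F at the basis element E_ij is the (i, j) entry of A E_ij B.
    """
    m, n = len(A), len(B)
    total = 0
    for i in range(m):
        for j in range(n):
            E = [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(m)]
            image = matmul(matmul(A, E), B)
            total += image[i][j]
    return total

A = [[1, 2], [3, 4]]                     # 2 x 2, trace 5
B = [[2, 0, 1], [1, 3, 0], [0, 1, 5]]    # 3 x 3, trace 10
```

For these sample matrices `trace_of_F(A, B)` agrees with `trace(A) * trace(B)`.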


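But now trace is linear (the trace of $A + B$ is the trace of $A$ plus the trace of $B$), and now we've arrived at our result: since $\Phi$ is defined as a linear combination of the maps $M \mapsto A_g^{-1} M B_g$, we can write down an explicit formula for the trace.


Lemma 38


The trace of the linear operator $\Phi$ defined above is
\[
\operatorname{tr} \Phi = \frac{1}{|G|} \sum_g \operatorname{tr}(A_g^{-1}) \operatorname{tr}(B_g) = \frac{1}{|G|} \sum_g \chi_\alpha(g^{-1}) \chi_\beta(g).
\]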


We're working with a finite group here, so every element has finite order. This means that $\chi(g^{-1}) = \overline{\chi(g)}$, because $\chi(g)$ is a sum of the eigenvalues of $\alpha_g$, in which all eigenvalues are roots of unity and therefore satisfy $\lambda^{-1} = \overline{\lambda}$.


So we can conclude that
\[
\operatorname{tr} \Phi = \langle \chi_\alpha, \chi_\beta \rangle
\]
with the Hermitian form that we've defined on characters. But $\Phi(M)$ is an invariant transformation for every $M$, so if $\alpha$ and $\beta$ are irreducible and not isomorphic, $\Phi$ is zero and so is its trace (we have orthogonality of irreducible characters).


Now, let's work a little harder and get orthonormality, which is part (1) of Theorem 16: 


Proof. The reason we keep averaging (dividing by the order of the group) is that if $M$ is already invariant,
\[
\Phi(M) = \frac{1}{|G|} \sum_g A_g^{-1} M B_g = \frac{1}{|G|} \sum_g M = M
\]
by the definition of invariance for $M$. Another way to phrase this is that whenever $\tilde{M} = \Phi(M)$, we have $\Phi(\tilde{M}) = \tilde{M}$; that is, $\Phi^2 = \Phi$. This is called a projection operator, and with such an operator, the space (in this case $\mathbb{C}^{m \times n}$) is the direct sum of the image and the kernel of $\Phi$. In this case, $M = \tilde{M} + (M - \tilde{M})$.


Lemma 39 


Our linear operator $\Phi$, as defined above, satisfies
\[
\operatorname{tr} \Phi = \dim \operatorname{Im} \Phi.
\]


This is because a projection operator has eigenvalue 1 on everything in $\operatorname{Im} \Phi$ and eigenvalue 0 on everything in the kernel; adding these up yields the result. So now by Schur's lemma, when $\alpha = \beta$, an invariant operator on the vector space must be a scalar multiple of the identity. So all elements of $\operatorname{Im} \Phi$ are scalar multiples of the identity, and this can only happen if $\dim \operatorname{Im} \Phi = 1$. Thus
\[
\langle \chi_\alpha, \chi_\alpha \rangle = \operatorname{tr} \Phi = \dim \operatorname{Im} \Phi = 1,
\]
as desired.


Example 40 


It's time to do another character table: we'll work with a nonabelian group of order 8. 


Turns out there are two of them, the quaternion group and $D_4$, but they have the same character table, so we shouldn't need to figure out which one it is.

If all irreducible representations have dimension 1, then $G$ is abelian, so that's ruled out (since the number of conjugacy classes would be 8). We have the trivial representation, and there are some representations with dimension at least 2 (they can't be of dimension 3, since $3^2 > 8$). Thus, the squares of the dimensions must add to 8, which forces the dimensions to be $1, 1, 1, 1, 2$. And it turns out the class equation for a nonabelian group of order 8 can't be $1 + 1 + 1 + 1 + 4$, so it must be $1 + 1 + 2 + 2 + 2$ (or else we'd have problems with orthogonality). So here's all the progress we've made so far:

\[
\begin{array}{c|ccccc}
 & (1) & (1) & (2) & (2) & (2) \\ \hline
\chi_1 & 1 & 1 & 1 & 1 & 1 \\
\chi_2 & 1 & & & & \\
\chi_3 & 1 & & & & \\
\chi_4 & 1 & & & & \\
\chi_5 & 2 & & & &
\end{array}
\]

Now we can “observe” that the next three rows can be like this:

\[
\begin{array}{c|ccccc}
 & (1) & (1) & (2) & (2) & (2) \\ \hline
\chi_1 & 1 & 1 & 1 & 1 & 1 \\
\chi_2 & 1 & 1 & 1 & -1 & -1 \\
\chi_3 & 1 & 1 & -1 & 1 & -1 \\
\chi_4 & 1 & 1 & -1 & -1 & 1 \\
\chi_5 & 2 & & & &
\end{array}
\]

and now by orthogonality, it's easy to find the last character:

\[
\begin{array}{c|ccccc}
 & (1) & (1) & (2) & (2) & (2) \\ \hline
\chi_1 & 1 & 1 & 1 & 1 & 1 \\
\chi_2 & 1 & 1 & 1 & -1 & -1 \\
\chi_3 & 1 & 1 & -1 & 1 & -1 \\
\chi_4 & 1 & 1 & -1 & -1 & 1 \\
\chi_5 & 2 & -2 & 0 & 0 & 0
\end{array}
\]

The next hour was originally going to be spent on representations of $SU_2$, the special unitary group, but we're going to move on to ring theory instead.


7 February 20, 2019 


Today, we're starting a new area of algebra. 


Definition 41 
A ring $R$ is a set with the operations of addition, subtraction, and multiplication, together with a multiplicative identity 1. Basically, $+$ and $\times$ are two laws of composition, and we have the following axioms:

• $(R, +)$ is an abelian group with identity 0.

• Multiplication is associative and commutative with identity 1. We're assuming all rings are commutative here, but this is not the definition that everyone uses.

• Distributivity holds: $a(b + c) = ab + ac$.


Here are some examples of rings:


Example 42


Any field is a ring, since it has addition and multiplication, and $1 \neq 0$.


(We don't want to rule out the possibility that $1 = 0$ in rings, but we'll come back to that later on.)


Example 43
$\mathbb{Z}$ is the ring of integers. $\mathbb{Z}[i]$, the Gaussian integers of the form $\{a + bi : a, b \in \mathbb{Z}\}$, is also a ring. Finally, $\mathbb{R}[x]$, the set of polynomials in $x$ with real coefficients, is a ring.


Lemma 44 


Given any ring $R$ and $a \in R$, $a \cdot 0 = 0$.


Proof. We know that
\[
0 = 0 + 0 \implies a \cdot 0 = a \cdot (0 + 0) = a \cdot 0 + a \cdot 0,
\]
and we can subtract $a \cdot 0$ from both sides (since the cancellation law holds for addition).


Fact 45


Note that if $0 = 1$, the ring consists of 0 alone. This is because
\[
a = a \cdot 1 = a \cdot 0 = 0
\]
for all $a$ in the ring.


This is called the zero ring, though it doesn't seem to be particularly important.


Polynomials, on the other hand, are a pretty important type of ring: 


Definition 46 
Let $R$ be any ring. Then the polynomial ring is defined as
\[
R[x] = \{a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0\},
\]
with arbitrary $a_i \in R$ and $n$ an arbitrary nonnegative integer.


Then we just do polynomial multiplication as normal:
\[
(a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0)(b_n x^n + \cdots + b_0) = \sum_{i,j} a_i b_j x^{i+j}.
\]
Thus, the coefficient of $x^k$ is
\[
a_k b_0 + a_{k-1} b_1 + \cdots + a_0 b_k = \sum_i a_i b_{k-i}.
\]


We can check that the ring axioms all work here: addition is component-wise, but checking the axioms for multiplication takes more work and isn't very interesting.
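
The coefficient formula is exactly a convolution of the two coefficient lists. Here is a minimal sketch, assuming polynomials are stored as Python lists `[a_0, a_1, ...]` with integer coefficients; the function name `poly_mul` is just a placeholder.

```python
def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists [a_0, a_1, ...]."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            c[i + j] += a_i * b_j   # a_i x^i times b_j x^j contributes to x^(i+j)
    return c

# (1 + 2x)(1 + 3x^2) = 1 + 2x + 3x^2 + 6x^3
product = poly_mul([1, 2], [1, 0, 3])
```

The same loop works over any commutative ring whose elements support `+` and `*`.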


Recall the division algorithm for positive integers:


Fact 47


Let $a, b$ be positive integers. Then there exist unique integers $q, r$ such that
\[
b = aq + r, \quad 0 \le r < a.
\]


We want to do something similar for our polynomial rings $R[x]$ as well. We do have to be careful, since we can't divide $x^2 + 1$ by $3x$ (for example) while still having integer coefficients.


Proposition 48 
Let $f, g \in R[x]$, and let $f$ be a monic polynomial (its leading coefficient is 1). Then there exist $q, r \in R[x]$ such that
\[
g = fq + r
\]
and $r = 0$ or the degree of $r(x)$ is less than the degree of $f$.


In school, we probably learn polynomial long division. (It depends on what country we're from?) Basically, if the leading term of $g$ is $b_n x^n$ and we divide by something with leading term $x^m$, the quotient has leading term $b_n x^{n-m}$.
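
Long division by a monic polynomial never needs to divide coefficients, which is why it works over any ring. The sketch below assumes integer coefficients stored as lists `[c_0, c_1, ...]`; the name `divide_by_monic` and the sample polynomials are chosen only for illustration.

```python
def divide_by_monic(g, f):
    """Return (q, r) with g = f*q + r and deg r < deg f, where f is monic.

    Polynomials are coefficient lists [c_0, c_1, ...].
    """
    if not f or f[-1] != 1:
        raise ValueError("divisor must be monic")
    r = list(g)
    q = [0] * max(len(g) - len(f) + 1, 0)
    for shift in range(len(g) - len(f), -1, -1):
        coeff = r[shift + len(f) - 1]    # current leading coefficient
        q[shift] = coeff
        for k, f_k in enumerate(f):
            r[shift + k] -= coeff * f_k  # subtract coeff * x^shift * f
    return q, r[:len(f) - 1]

# Divide g = x^3 + 2x + 5 by f = x^2 - 4x + 5.
q, r = divide_by_monic([5, 2, 0, 1], [5, -4, 1])
```

For this sample input the quotient is $x + 4$ and the remainder is $13x - 15$.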


Fact 49 (Unimportant) 
Musical staves come from strings on a musical instrument. Tablature has to do with how this is notated. But 


this may also have to do with long division? This is also related to other “logical German things.” 


A similar property holds for $\mathbb{Z}[i]$:


Proposition 50 (Division algorithm for the Gaussian integers)
Given $\alpha, \beta \in \mathbb{Z}[i]$ with $\alpha \neq 0$, there exist $q, r \in \mathbb{Z}[i]$ such that
\[
\beta = \alpha q + r, \quad |r| < |\alpha|.
\]


Proof. Gaussian multiples of $\alpha$ include $\alpha, i\alpha, -\alpha, -i\alpha$, so we can form a grid $\{(a + bi)\alpha : a, b \in \mathbb{Z}\}$. This is a square grid, and it tiles the plane.

Now if $\beta$ is in one of the squares, it'll be within $|\alpha|$ of one of the vertices (in fact within $|\alpha|/\sqrt{2}$, half the diagonal of a square of side $|\alpha|$). Subtract multiples of $\alpha$ until we move that vertex to 0. Now $r$ is just the difference between $\beta$ and the closest vertex, and $\alpha q$ is whatever else was subtracted off.


Note that this answer $\beta = \alpha q + r$ may not be unique, since there are sometimes multiple vertices of our grid that are within $|\alpha|$ of $\beta$.
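
Picking the closest grid vertex amounts to rounding the exact quotient $\beta/\alpha$ coordinate-wise. Here is one way this might look, assuming Gaussian integers are stored as pairs `(re, im)` of Python integers; `gauss_divmod` is a made-up name, and rounding half up is just one of the possible tie-breaking choices.

```python
def gauss_divmod(beta, alpha):
    """Return (q, r) with beta = alpha*q + r and |r| < |alpha|."""
    (b1, b2), (a1, a2) = beta, alpha
    norm = a1 * a1 + a2 * a2
    if norm == 0:
        raise ZeroDivisionError("alpha must be nonzero")
    # beta / alpha = beta * conj(alpha) / norm
    num_re = b1 * a1 + b2 * a2
    num_im = b2 * a1 - b1 * a2
    # Round each coordinate to the nearest integer using exact integer arithmetic.
    q1 = (2 * num_re + norm) // (2 * norm)
    q2 = (2 * num_im + norm) // (2 * norm)
    # r = beta - alpha * q
    r1 = b1 - (a1 * q1 - a2 * q2)
    r2 = b2 - (a1 * q2 + a2 * q1)
    return (q1, q2), (r1, r2)

# Divide beta = 7 + 3i by alpha = 2 + i.
q, r = gauss_divmod((7, 3), (2, 1))
```

Here $7 + 3i = (2 + i) \cdot 3 + 1$, and the remainder has norm 1, smaller than the norm 5 of $\alpha$.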


The next definition is even more important for rings than for groups:


Definition 51
A ring homomorphism $\varphi : R \to R'$ for rings $R, R'$ is a map such that
\[
\varphi(a + b) = \varphi(a) + \varphi(b), \quad \varphi(ab) = \varphi(a)\varphi(b), \quad \varphi(1_R) = 1_{R'}
\]
for all $a, b \in R$.

Notice that we needed to specify that the multiplicative identity goes to the other ring's multiplicative identity, since we don't always have multiplicative inverses! Otherwise, we could have (for example) just sent everything in $R$ to $0_{R'}$, which wouldn't preserve the multiplicative structure in the ways that we want. On the other hand, we don't need to say that $\varphi(0_R) = 0_{R'}$, since that is automatic from the group structure of addition.


Proposition 52 


For any ring $R$, there exists a unique homomorphism $\varphi : \mathbb{Z} \to R$.


We haven't actually defined addition and multiplication in the integers yet, so this may be kind of hard to prove. But the idea is that $\varphi(1_{\mathbb{Z}})$ must be $1_R$, and then $\varphi(2) = \varphi(1 + 1) = \varphi(1) + \varphi(1)$, and so on, which uniquely determines $\varphi$ for every integer. (By the way, the multiplicative identity of a ring is unique, because if $1$ and $1'$ are both identities, then $1 = 1 \cdot 1' = 1'$.)


We can read Landau's book, “Introduction to Arithmetic,” for more details. There are lots of gory details.


Proposition 53 (Substitution principle) 
Let $R$ be a ring, and let $\alpha \in R$. Then there exists a unique homomorphism $\Phi : R[x] \to R$ such that $\Phi$ is the identity on constant polynomials and $\Phi(x) = \alpha$. Basically,
\[
\Phi(a_m x^m + \cdots + a_0) = a_m \alpha^m + \cdots + a_0.
\]


This is pretty easy to check: we are just replacing $x$ with $\alpha$ in every polynomial.


Proposition 54 (Substitution, version 2)


Let $\varphi : R \to R'$ be a ring homomorphism, and let $\alpha \in R'$. Then there exists a unique homomorphism
\[
\Phi : R[x] \to R'
\]
such that $\Phi = \varphi$ on constant polynomials and $\Phi(x) = \alpha$.


This uses the same construction, except that we replace the coefficients $a_n$ with $\varphi(a_n)$.


Example 55 
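If we take $R = \mathbb{R}$ and $R' = \mathbb{C}$, where $\varphi$ is the inclusion of $\mathbb{R}$ into $\mathbb{C}$, and we let $\alpha = 1 + i$, the substitution map
\[
\Phi : \mathbb{R}[x] \to \mathbb{C}
\]
sends a polynomial $f(x)$ to $f(1 + i)$.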


Example 56 


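Let $R = \mathbb{Z}$ and $R' = \mathbb{F}_p$, the field of $p$ elements (or the integers mod $p$). If we choose $\alpha = \overline{2}$ (which is the residue class of 2 mod $p$), then the map
\[
\Phi : \mathbb{Z}[x] \to \mathbb{F}_p
\]
sends $a_m x^m + \cdots + a_0$ to $a_m 2^m + \cdots + a_0 \pmod{p}$.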
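
To see the substitution map behave like a homomorphism, we can evaluate at 2 mod $p$ before and after multiplying. This sketch assumes $p = 7$ and two small sample polynomials picked for illustration; `substitute_mod_p` and `poly_mul` are hypothetical helper names, and polynomials are coefficient lists `[a_0, a_1, ...]`.

```python
def substitute_mod_p(coeffs, alpha, p):
    """Image of a_0 + a_1 x + ... under x -> alpha, reduced mod p (Horner's rule)."""
    value = 0
    for a in reversed(coeffs):
        value = (value * alpha + a) % p
    return value

def poly_mul(a, b):
    """Multiply coefficient lists [a_0, a_1, ...]."""
    c = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            c[i + j] += a_i * b_j
    return c

p = 7
f = [1, 0, 1]    # 1 + x^2
g = [3, 5]       # 3 + 5x

# Phi(f * g) versus Phi(f) * Phi(g), both computed in F_7.
lhs = substitute_mod_p(poly_mul(f, g), 2, p)
rhs = (substitute_mod_p(f, 2, p) * substitute_mod_p(g, 2, p)) % p
```

Both sides come out to the same residue, as multiplicativity of $\Phi$ predicts.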


As a related example, we can also define a map $\mathbb{Z}[x] \to \mathbb{F}_p[x]$ by replacing each coefficient $a_m$ with its residue $\overline{a_m}$ and keeping $x$. And in this case, note that we don't have to explicitly specify $\varphi$, since there is a unique map from $\mathbb{Z}$ to $\mathbb{F}_p$.


8 February 22, 2019


Recall the definition of a ring: it has addition, subtraction, multiplication, and a multiplicative identity 1. We also have rings of the form $R[x]$, which are polynomials with coefficients in $R$.


Fact 57 


It's important to note that x doesn’t take on any particular value: it is a variable, so it doesn’t satisfy any relations 


in our polynomial rings. 


Polynomials have a basis $\{1, x, x^2, \cdots\}$. We also have a notion of a ring homomorphism
\[
\varphi : R \to R'
\]
which is just compatible with both operations and satisfies $\varphi(1_R) = 1_{R'}$.


Example 58
There exists a unique homomorphism $\varphi$ from $\mathbb{Z}$ to any ring $R$: it sends 1 to $\varphi(1) = 1_R$, 2 to $\varphi(1) + \varphi(1)$, and so on. We also have the “substitution principle:” if we have a ring homomorphism $\varphi : R \to R'$ and an element $\alpha \in R'$, then there exists a unique ring homomorphism $\Phi : R[x] \to R'$ such that $\Phi = \varphi$ on constants and $\Phi(x) = \alpha$.


Now, let's introduce the concept of a kernel: it applies to the addition operation in our ring (rather than multiplication).


Definition 59 
The kernel $K$ of a homomorphism $\varphi : R \to R'$ is the set of $a \in R$ such that $\varphi(a) = 0'$.


The kernel is a normal subgroup of the additive group of $R$. But addition in a ring is commutative, so the fact that it is normal is trivial.
We also have an interesting fact: if $a \in K, r \in R$, then $ra \in K$, since $\varphi(ra) = \varphi(r)\varphi(a) = \varphi(r) 0' = 0'$. So the kernel is an example of something interesting that we don't find in groups:


Definition 60
An ideal $I$ of a ring $R$ is a subgroup of $(R, +)$ such that $a \in I, r \in R \implies ra \in I$.


Example 61 


There's a unique map $\mathbb{Z} \to \mathbb{F}_p$: it sends 1 to the residue of 1 mod $p$, and so on. Then the kernel of this map is $p\mathbb{Z}$, so $p\mathbb{Z}$ is an ideal. (And we can check the definition to see that this is indeed true.)


Example 62 
Consider the unique homomorphism $\mathbb{Z}[x] \to \mathbb{C}$ that sends $x \mapsto 2 + i$. What is its kernel?


First of all, we can find a quadratic polynomial with root $2 + i$, and that will be in the kernel. It's
\[
f(x) = (x - (2 + i))(x - (2 - i)) = x^2 - 4x + 5.
\]
We claim that the kernel of this map is just all polynomial multiples of $f$. To show this, let's say $g$ is in the kernel, so $g(2 + i) = 0$. Since $f$ is monic, by the division algorithm,
\[
g = fq + r,
\]
where $r$ has degree less than 2. $g$ and $fq$ are both in the kernel, so $r$ must also be in the kernel. But there is no nonzero constant or linear polynomial with integer coefficients that has $2 + i$ as a root, so we must have $r = 0$, and that means $f$ divides $g$.


Definition 63 


A principal ideal $I$ of $R$ is of the form $I = Ra$ for some $a \in I$. A principal ideal ring $R$ is a ring where every ideal is principal.


All ideals we've discussed so far have been principal ideals, and that's for a good reason:


Proposition 64
$\mathbb{Z}$, $F[x]$ for a field $F$, and $\mathbb{Z}[i]$ are all principal ideal rings.


Proof. For $\mathbb{Z}$, all subgroups under addition are of the form $n\mathbb{Z}$ or 0. Each of these is the principal ideal generated by $n$ or 0, respectively.

For rings of the form $R = F[x]$, let's say we start with an ideal $I$. If it is just the zero ideal, we're okay. Otherwise, there are some nonzero polynomials in $I$, and among those of minimal degree there is a monic polynomial $f$ (since we have a field, we can divide by the leading coefficient). Now we want to show $I = fR$. For any $g \in I$, we can use the division algorithm to find that $g = fq + r$. Since $fq, g \in I$, we have $r \in I$, and therefore $r = 0$ by the minimality of the degree of $f$ (exactly analogous to what we did above).


Finally, for the Gaussian integers, the division algorithm works, so we can just start by picking a nonzero element of the ideal with minimal norm.


Example 65 
Consider the map $\mathbb{Z}[x] \to \mathbb{F}_p$ where we send $x \mapsto 0$. Then the kernel $K$ of this map is the set of polynomials $g$ such that
\[
g(0) \equiv 0 \pmod{p}.
\]


But notice that $x$ and $p$ are both in the kernel, and the only common factors of both of these are $\pm 1$, so a single generator would force 1 to be in the kernel. Clearly this is not true, so $K$ is not a principal ideal of $\mathbb{Z}[x]$.


In general, we want to ask the question of “how many elements we need to generate $I$.”


Definition 66


A set of generators for an ideal $I$ is a set of elements $a_1, \cdots, a_k \in I$ such that every element of $I$ is a linear combination
\[
r_1 a_1 + \cdots + r_k a_k, \quad r_i \in R.
\]
This is notated as $I = (a_1, \cdots, a_k)$.


So in the above case, $x, p$ generate $K$, but there's no way to have a single element generate $K$.
So why is this object called an ideal? We'll start with a motivating example: consider the ring $R = \mathbb{Z}[\sqrt{-5}]$. This ring is the set of complex numbers of the form $\{a + b\sqrt{-5} \mid a, b \in \mathbb{Z}\}$.


Fact 67


This ring doesn't have unique factorization! In particular,
\[
6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5}),
\]
and there's no way to resolve these two factorizations directly.


Number theorists didn't like this: somehow the Gaussian integers have unique factorization, but this “weird ring” didn't. So they introduced the concept of an “ideal element,” which “ideally should be there.”

The idea is that we can factor with ideals instead of elements! So in this ring, let $A$ be the ideal $(2, 1 + \sqrt{-5})$. Notice that in the complex plane, we have a rectangular grid of elements in our ring, so we can get a geometric sense of what's going on. For notation's sake, let $\delta = \sqrt{-5}$. $A$ is closed under complex conjugation, since $1 - \delta = 2 - (1 + \delta)$ is in $A$ as well.

Now define $B = (3, 1 + \delta) \implies \overline{B} = (3, 1 - \delta)$. Let's try to multiply some ideals together:


Definition 68
The product ideal $AB$ is the set of finite sums of products $\alpha_i \beta_i$ where $\alpha_i \in A, \beta_i \in B$. In other words, $AB$ is generated by $\{\alpha_i \beta_j\}$.


In this case, $AB$ has four generators (taking one generator from each of $A$ and $B$): $(6, 2 + 2\delta, 3 + 3\delta, (1 + \delta)^2)$. Then $(3 + 3\delta) - (2 + 2\delta) = 1 + \delta$ is in $AB$ as well, but notice that this divides all four generators! This means that $(1 + \delta) \subset AB \subset (1 + \delta)$, so $AB = (1 + \delta)$ is a principal ideal.

Similarly, we can find that $\overline{A}\,\overline{B} = \overline{(1 + \delta)} = (1 - \delta)$. Furthermore,
\[
(6) = (1 + \delta)(1 - \delta) = AB\overline{A}\,\overline{B}.
\]

So now
\[
A\overline{A} = (4, 2 - 2\delta, 2 + 2\delta, 6)
\]
contains $6 - 4 = 2$ and 2 divides everything, so $A\overline{A} = (2)$. Similarly, $B\overline{B} = (3)$, and now
\[
A\overline{A}B\overline{B} = (2)(3) = (6).
\]

So we've found a better factorization of 6 in our ring. The central idea here is that we've replaced numbers with ideals, and we've resolved the two different methods of factorization. This concept of replacing elements with ideals doesn't always work, but it works in a lot of cases!
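
The divisibility claims in this computation are easy to verify by hand, but they can also be checked mechanically. The sketch below assumes elements $a + b\delta$ of $\mathbb{Z}[\delta]$, $\delta = \sqrt{-5}$, are stored as integer pairs `(a, b)`; the helpers `mul` and `divides` are invented names, and divisibility is tested by multiplying by the conjugate and dividing by the norm.

```python
def mul(x, y):
    """(a + b*delta)(c + d*delta) with delta^2 = -5."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)

def divides(x, y):
    """True if y = x * z for some z in Z[delta]; x must be nonzero."""
    a, b = x
    norm = a * a + 5 * b * b
    c, d = mul(y, (a, -b))          # y * conj(x), so y / x = (c + d*delta) / norm
    return c % norm == 0 and d % norm == 0

A = [(2, 0), (1, 1)]                # generators 2 and 1 + delta
A_BAR = [(2, 0), (1, -1)]           # generators 2 and 1 - delta
B = [(3, 0), (1, 1)]                # generators 3 and 1 + delta

AB_gens = [mul(a, b) for a in A for b in B]
AA_bar_gens = [mul(a, c) for a in A for c in A_BAR]

one_plus_delta_divides_AB = all(divides((1, 1), g) for g in AB_gens)
two_divides_AA_bar = all(divides((2, 0), g) for g in AA_bar_gens)
```

The last assertion one might add is that $1 + \delta$ does not divide 2 in this ring, which is part of why the two element-level factorizations of 6 cannot be refined into each other.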


9 February 25, 2019 


Recall the idea of a quotient group from 18.701: if $N$ is a normal subgroup of $G$, then the quotient group $\overline{G} = G/N$ is a group with cosets as elements: $\overline{a} = aN$, and multiplication is defined as
\[
(aN)(bN) = abN.
\]


We'll similarly define the concept of a quotient ring now:


Definition 69
Given an ideal $I$ of $R$, which is closed under addition by $I$ and multiplication by $R$, define the quotient ring $R/I$ to be the set of additive cosets $(a + I)$. The operations are
\[
(a + I) + (b + I) = (a + b) + I,
\]
\[
(a + I)(b + I) = ab + I.
\]


This multiplication definition makes sense, since if we have $x, y \in I$, then
\[
(a + x)(b + y) = ab + ay + bx + xy = ab + (ay + bx + xy) \in ab + I.
\]


Example 70 
Let $R = \mathbb{Z}$ and $I = 8\mathbb{Z}$. Then the set of products of elements of $2 + I$ satisfies
\[
(2 + I)(2 + I) \subset 4 + I.
\]


But this is clearly not an equality: all elements of $(2 + I)(2 + I)$ are 4 mod 16, so we never hit 12 (for example).
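
This is quick to confirm on a finite window of the coset. The sketch below only samples $2 + 8a$ for $a$ in a small range (an arbitrary choice, so it illustrates the claim rather than proving it); the variable names are made up.

```python
window = range(-20, 21)

# Products of two elements of the coset 2 + 8Z, sampled on a finite window.
products = {(2 + 8 * a) * (2 + 8 * b) for a in window for b in window}

residues_mod_16 = {n % 16 for n in products}
all_in_coset = all((n - 4) % 8 == 0 for n in products)
```

Every sampled product lies in $4 + 8\mathbb{Z}$, but only the residue 4 mod 16 shows up, so 12 is never reached.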


Notation-wise, we'll write the quotient ring as $R/I = \overline{R}$ and denote the coset $a + I$ as $\overline{a}$. However, note that we can write the same coset with various $a$s, just like we could write the same coset $gH$ with various $g$s.


Example 71
If $R = \mathbb{Z}$ and $I = p\mathbb{Z}$, then $R/I = \overline{R}$ is $\mathbb{F}_p$.


It's probably good to check the ring axioms to ensure that quotient rings are actually rings. To do that, notice that there is a map
\[
\pi : R \to \overline{R} = R/I
\]
which sends $a \mapsto \overline{a} = a + I$. This is a surjective homomorphism, so the ring axioms follow from the axioms for $R$.


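Theorem 72 (First Isomorphism Theorem)
Let $\varphi : R \to R'$ be a surjective ring homomorphism with kernel $K = \ker \varphi$, and let $\pi : R \to R/K$ be the quotient map. Then $R'$ is isomorphic to $R/K$, and there exists a unique isomorphism $\overline{\varphi} : R/K \to R'$ such that $\overline{\varphi} \circ \pi = \varphi$.

\[
\begin{array}{ccc}
R & \xrightarrow{\ \varphi\ } & R' \\
{\scriptstyle \pi} \downarrow & \nearrow {\scriptstyle \overline{\varphi}} & \\
R/K & &
\end{array}
\]


Proof. This is a similar proof to the one for groups. Define $\overline{\varphi}(\overline{a}) = \varphi(a)$. First, we need to check that this definition is consistent (since different $a$ in the same coset $\overline{a}$ should have the same value of $\varphi(a)$): if $\overline{a'} = \overline{a}$ for two elements $a, a'$, then $a' = a + x$ for some $x \in K$. Thus, $\varphi(a') = \varphi(a + x) = \varphi(a) + \varphi(x)$, and since $x$ is in the kernel, $\varphi(a') = \varphi(a)$, and our function is well-defined. It is a homomorphism because $\varphi$ is one, and $\overline{\varphi}$ is surjective because $\varphi$ is surjective.


Now if $\overline{a} \in \ker \overline{\varphi}$, then $\overline{\varphi}(\overline{a}) = 0$. By our earlier definition, this means $\varphi(a) = 0$, so $a \in K$. Thus $\overline{a} = \overline{0}$, so the kernel of $\overline{\varphi}$ is trivial, and $\overline{\varphi}$ is injective, as desired.

Let's think about the case where $I$ is a principal ideal: denote this ideal as $I = aR = (a)$. Then
\[
R/I = \overline{R}
\]
is a set of cosets, and since $\overline{a} = \overline{0}$, we can think of this as a ring obtained from $R$ by forcing $a = 0$. This is similar to adding a relation in our ring $R$: for any $r \in R$, $\overline{r}\,\overline{a} = \overline{0}$, and $\overline{b + ra} = \overline{b}$. And in the principal ideal case, these are the only consequences of setting $a = 0$.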


Example 73 


Similarly, we can set $x^2 + 1 = 0$ in $R = \mathbb{R}[x]$. Then the ideal $(x^2 + 1)$ is the kernel of the homomorphism
\[
\Phi : \mathbb{R}[x] \to \mathbb{C}
\]
that sends $f(x)$ to $f(i)$. Thus, by the First Isomorphism Theorem,
\[
\mathbb{C} \cong \mathbb{R}[x]/(x^2 + 1).
\]
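
Concretely, every residue in $\mathbb{R}[x]/(x^2 + 1)$ has a unique representative $a + bx$, and multiplying representatives while replacing $x^2$ by $-1$ reproduces complex multiplication. A minimal sketch, assuming residues are stored as pairs `(a, b)` and using Python's built-in `complex` for comparison; the function name is made up.

```python
def mul_mod_x2_plus_1(p, q):
    """Multiply the residues a + b*x and c + d*x in R[x]/(x^2 + 1), using x^2 = -1."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

p, q = (1.0, 2.0), (3.0, -1.0)

residue = mul_mod_x2_plus_1(p, q)          # (1 + 2x)(3 - x) reduced mod x^2 + 1
as_complex = complex(*p) * complex(*q)     # (1 + 2i)(3 - i)
```

The pair `residue` and the real and imaginary parts of `as_complex` agree, which is the isomorphism $\overline{x} \mapsto i$ in action.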


Fact 74 


This is one way to think about the notation for the Gaussian integers: after all,
\[
\mathbb{Z}[i] \cong \mathbb{Z}[x]/(x^2 + 1).
\]


However, most of the time, a quotient ring is not as recognizable as $\mathbb{Z}[i]$ or $\mathbb{C}$.
From this, we also get the concept of adjoining a new element to a ring $R$. Given $R$, we add a new element $\alpha$ with some additional properties. Basically, start with the polynomial ring $R[x]$ (for some new element $x$ not in $R$), and then mod out by all of the relevant relations $f(x) = 0$. (This is how we constructed $\mathbb{C}$ above.)


Example 75 


Take our ring $R = \mathbb{R}[t]$, and let's adjoin an element $\alpha$ which is a root of the polynomial
\[
f = x^3 + tx + t \in R[x] = \mathbb{R}[t, x].
\]


The solution is to force $x^3 + tx + t = 0$ in $R[x]$, and then we can define $\alpha = \overline{x}$ to be the residue of $x$ in the quotient ring $R[x]/(f)$. How do we compute (for example, multiply elements) in $R[\alpha]$? Any polynomial $g(x)$ becomes an element $g(\alpha) \in R[\alpha]$. Since $f$ is monic in $x$, we can write
\[
g = fq + r,
\]
where $r$ has degree less than 3 in $x$. Since $f(\alpha) = 0$, we must have $g(\alpha) = r(\alpha)$, where $r(x)$ is a polynomial of degree at most 2 in $x$ with coefficients in $R$. This is as much as we can simplify, so $(1, \alpha, \alpha^2)$ is an $R$-basis for $R[x]/(f)$.


Fact 76 


It's a bit more work to do this adjoining process when $f$ is not monic, though; we won't talk much about it.


The next thing we can try to do is to adjoin $a^{-1}$ to $R$. Then $a^{-1}$ is a root of $ax - 1$, so
\[
R[a^{-1}] = R[x]/(ax - 1).
\]
So for example, $\mathbb{C}[t, t^{-1}] = \mathbb{C}[t, x]/(tx - 1)$, which just gives us polynomials in $t$ and $t^{-1}$. This is called the Laurent polynomial ring, and it's used to build Laurent series in complex analysis. The elements of this ring look like
\[
\sum_{j=-N}^{N} c_j t^j, \quad c_j \in \mathbb{C}.
\]


Keep in mind, though, that this isn't necessarily a field! And in general, if we want to adjoin infinitely many things, this isn't the best way to do it.


10 February 27, 2019 


We'll start with a motivating question: let $R$ be a ring and $F$ be a field. Under what conditions does there exist an injective homomorphism $\varphi : R \to F$?


Definition 77


An (integral) domain is a nonzero ring $R$ such that for any two nonzero elements $a, b \in R$, we have $ab \neq 0$ (there are no “zero divisors”).


Examples include fields and the ring of integers.


Lemma 78 (Cancellation law)
If $a \neq 0$ and $ab = ac$ in a domain $R$, then $b = c$.


This is true because
\[
ab = ac \implies a(b - c) = 0 \implies b - c = 0,
\]
since $a \neq 0$ and it's not possible (in a domain) for two nonzero things to multiply to 0.


Proposition 79 


If $R$ is a domain, there exists an injective homomorphism $R \to F$ for some field $F$.


Proof. We consider the fraction field of the domain $R$, which is the set of equivalence classes
\[
\left\{ \frac{a}{b} : a, b \in R, \ b \neq 0 \right\}
\]
where $\frac{a}{b} = \frac{c}{d}$ if and only if $ad = bc$.


We can check silly things like associativity of addition in a fraction field, which is completely routine. It's important to note transitivity, though: if $\frac{a}{b} = \frac{c}{d}$ and $\frac{c}{d} = \frac{e}{f}$, then $\frac{a}{b} = \frac{e}{f}$. This is because
\[
ad = bc, \ cf = de \implies bcf = bde \implies daf = deb \implies af = eb
\]
by the cancellation law (note that we can only cancel by $b$ and $d$ since we know those are nonzero). Now just send any $a \in R$ to $\frac{a}{1}$ in the fraction field, and we have an injective map as desired.


Proposition 80


Let $R$ be a domain, $F$ be a field, and let $\varphi : R \to F$ be an injective homomorphism. Then $\varphi$ extends to an injective homomorphism
\[
\varphi' : \operatorname{fract}(R) \to F
\]
where $\varphi'\left(\frac{a}{b}\right) = \varphi(b)^{-1}\varphi(a)$.


We can check that this is compatible, and that's more of an annoying checklist. Here's something that's a little less simple: how do we deal with surjectivity?


Theorem 81 (Correspondence Theorem for rings)
Given a surjective homomorphism $\varphi : R \to R'$ with kernel $K$, there is a bijective correspondence between ideals of $R$ containing $K$ and ideals of $R'$. Basically, any ideal $I$ in $R$ containing $K$ corresponds to its image $\varphi(I)$, and any ideal $J$ in $R'$ corresponds to its preimage $\varphi^{-1}(J)$.


This is important, because fields have very few ideals: 


Lemma 82 


The only ideals of a field $F$ are the whole field and the zero ideal.


Proof. If there are no nonzero elements in an ideal $I$, then we just have the zero ideal. Otherwise, if $a \neq 0$ is in $I$, then $a a^{-1} = 1$ is in $I$, and then $I = F$ is the whole field. This is called the unit ideal, and it is often denoted $(1)$.


This means that there can't be that many ideals that contain the kernel. If $\varphi : R \to F$ is a surjective homomorphism to a field $F$ with kernel $M$, then $R'$ is the field $F$ in the Correspondence Theorem. The two ideals $0$ and $F$ of $F$ correspond to $M$ and $R$, so the only ideals that contain $M$ are $M$ and $R$: there are no proper ideals strictly containing $M$.


Definition 83


A maximal ideal $M$ of a ring $R$ is a proper ideal (not equal to $R$), such that there is no ideal $I$ satisfying
\[
M \subsetneq I \subsetneq R.
\]


Basically, “we can't get any larger and still have something interesting.” And the above logic gives us the following 


result: 


Corollary 84 


The kernel of a surjective homomorphism $\varphi : R \to F$ onto a field $F$ is a maximal ideal.


Let's also show the “other direction:”


Lemma 85


Let $M$ be a maximal ideal of a ring $R$. Then $F = R/M$ is a field.


Proof. We need to show that $F$ is not the zero ring, and that inverses exist: $a \neq 0 \in F \implies a^{-1} \in F$.
The first condition is true because $M$ is not $R$. For the second, it's sufficient to show that any ring with exactly two ideals is a field, since the only ideals of $R$ containing $M$ are $M$ and $R$, so $R/M$ has exactly two ideals by the Correspondence Theorem.

So let $R$ now denote any ring with exactly two ideals, $(0)$ and $R$. It is not the zero ring, because that only has one ideal. Now, let's say $a \neq 0$ is in $R$, and we take the ideal $(a)$ of multiples of $a$. Since $a \in (a)$, $(a) \neq (0)$, so $(a) = R \implies 1 \in (a)$. This means that $b = a^{-1}$ exists in $R$, so $R$ must be a field.


Example 86 


Consider $R = \mathbb{Q}[x]$, where $\mathbb{Q}$ is the rationals. Take the principal ideal $M$ generated by $f(x) = x^3 - 2$.


Why does this generate a maximal ideal? Every ideal of $R$ is principal (this is the Division Algorithm again, as in Proposition 64), so if $(f) \subset I$ and $I = (p)$ for some $p \in R$, then we can write $f = pq$ for some $q \in R$, meaning that $p$ divides $f$. But $f$ has no nontrivial factors, since it would need to split into a quadratic and a linear factor, and $\sqrt[3]{2}$ is not rational. So $R/(f)$ must be a field. We can use the map
\[
\varphi : R = \mathbb{Q}[x] \to \mathbb{C}
\]
which sends $x \mapsto \alpha = \sqrt[3]{2}$ (a root of $x^3 - 2$), so that $\varphi(x^3 - 2) = 0$. This means that the kernel of $\varphi$ contains the ideal generated by $x^3 - 2$, which is a maximal ideal, so the kernel is $(x^3 - 2)$. Thus, by the First Isomorphism Theorem, the image of $\varphi$ is isomorphic to $R/(x^3 - 2)$: this is $\mathbb{Q}[\sqrt[3]{2}]$ with a $\mathbb{Q}$-basis of $(1, \alpha, \alpha^2)$.

It isn't immediately obvious that this is a field, but we can check with some computations that it is! Note that the other roots of $x^3 - 2$ are $\alpha\omega$ and $\alpha\omega^2$, where $\omega$ is a primitive third root of unity. So now if we consider an alternate map $\varphi' : \mathbb{Q}[x] \to \mathbb{C}$ where we send $x \mapsto \omega\alpha$ instead, we still send $x^3 - 2 \mapsto 0$, but we find now that the image of $\varphi'$ is isomorphic to $R/(x^3 - 2)$ as well.
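
One way to do the "check with some computations" is to write down inverses explicitly. The sketch below assumes elements $a + b\alpha + c\alpha^2$ (with $\alpha^3 = 2$) are stored as triples `(a, b, c)` of rationals. The inverse formula used here is not taken from the lecture: it is the candidate $(a^2 - 2bc) + (2c^2 - ab)\alpha + (b^2 - ac)\alpha^2$ divided by $a^3 + 2b^3 + 4c^3 - 6abc$, which can be confirmed by expanding the product; the function names are made up.

```python
from fractions import Fraction

def mul(u, v):
    """Multiply a0 + a1*alpha + a2*alpha^2 by b0 + b1*alpha + b2*alpha^2, with alpha^3 = 2."""
    a0, a1, a2 = u
    b0, b1, b2 = v
    return (a0 * b0 + 2 * (a1 * b2 + a2 * b1),
            a0 * b1 + a1 * b0 + 2 * a2 * b2,
            a0 * b2 + a1 * b1 + a2 * b0)

def inverse(u):
    """Inverse of a nonzero element, via an explicit (assumed) norm formula."""
    a, b, c = (Fraction(t) for t in u)
    norm = a ** 3 + 2 * b ** 3 + 4 * c ** 3 - 6 * a * b * c
    if norm == 0:
        raise ZeroDivisionError("element is not invertible")
    return ((a * a - 2 * b * c) / norm,
            (2 * c * c - a * b) / norm,
            (b * b - a * c) / norm)

u = (1, 1, 0)            # the element 1 + alpha
u_inv = inverse(u)       # expected to be (1 - alpha + alpha^2) / 3
check = mul(u, u_inv)    # should be the identity (1, 0, 0)
```

For $u = 1 + \alpha$ this recovers the familiar identity $(1 + \alpha)(1 - \alpha + \alpha^2) = 1 + \alpha^3 = 3$.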

So $\mathbb{Q}[\omega\alpha]$ and $\mathbb{Q}[\alpha]$ are isomorphic. One is a subfield of the reals, and the other is not, but the two rings are still isomorphic! So we can't tell just by looking at the structure of the ring whether the element $\alpha$ that we adjoin is real.


The main topic of today is the Nullstellensatz.


Fact 87


In German, “Nullstellensatz” means “zero place theorem.”


Recall that a maximal ideal $M$ of $R$ is an ideal $M \subsetneq R$ such that there exist no ideals $I$ with $M \subsetneq I \subsetneq R$. We found last time that $M$ being maximal is equivalent to $R/M$ being a field. In other words, if $\varphi : R \to F$ is a surjective ring homomorphism, then $F$ is a field if and only if the kernel of $\varphi$ is a maximal ideal. This is because a field has only two ideals: 0 and the whole field.


Example 88 
Let R = C[x]. Since every ideal of R is principal, for any ideals /, there exists an f € / such that every g € /, 


g = fq. Then / is denoted (f), Rf, or fR: it’s the principal ideal generated by f. 
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In particular, we can use the division algorithm to find the polynomial with minimal degree in /: that’s the one that 


generates /. 
Question 89. When is an ideal (g) maximal in C[x]? 


If another proper ideal is larger — that is, (fF) D (g) — then g € (f) = g=fq. So g is maximal if there is no 
such f, meaning g should be irreducible in C[x]. But this happens if and only if the degree is 1 (by the Fundamental 


Theorem of Algebra), so we have the following result: 


Corollary 90 


The maximal ideals of R = C[x] correspond to elements a € C, and they can be written as R/(x — a). 


We can think of such maximal ideals as corresponding to the kernels of homomorphisms 


$$\mathbb{C}[x] \to \mathbb{C}, \quad x \mapsto a.$$


Now let's move on, and let $R = \mathbb{C}[x_1, x_2, \dots, x_n]$ be the ring of polynomials in more variables. Now ideals look more complicated: we can't say very much about them at this point. We should remember that $a_1, a_2, \dots, a_k$ generate an ideal $I$ if

$$I = \left\{ \sum_i r_i a_i \;\middle|\; r_i \in R \right\}.$$

Basically, we take all $R$-linear combinations of our generating set. Unfortunately, there's no bound for the number of generators needed for an ideal of $R = \mathbb{C}[x_1, \dots, x_n]$, and a priori a generating set could even be infinite! But here's something useful to help us:


Theorem 91 (Hilbert Basis Theorem) 


Every ideal $I$ of the polynomial ring $\mathbb{C}[x_1, \dots, x_n]$ can be generated by finitely many elements.


(We'll prove this later on in the class in an alternate form.) With that, it’s time for a big result which was published 
along with the Hilbert Basis Theorem in 1895: 


Theorem 92 (Hilbert Nullstellensatz) 


The maximal ideals of the ring $R = \mathbb{C}[x_1, \dots, x_n]$ correspond to points $a = (a_1, \dots, a_n) \in \mathbb{C}^n$.


This is the analog of the 1-dimensional theorem. Basically, given a point $a \in \mathbb{C}^n$, the maximal ideal is the kernel of the homomorphism

$$\varphi_a: \mathbb{C}[x_1, \dots, x_n] \to \mathbb{C}, \quad x_i \mapsto a_i.$$

This homomorphism sends any polynomial $g(x_1, \dots, x_n)$ to $g(a_1, \dots, a_n)$. Indeed, the kernel here is a maximal ideal, since $\varphi_a$ is surjective onto the field $\mathbb{C}$. (Equivalently, if some other element $f$ were added to the ideal, we could use the division algorithm to reduce it to a constant, and that's either 0 or it generates the whole ring.)

All such ideals are generated by $\{x_1 - a_1, x_2 - a_2, \dots, x_n - a_n\}$. This means that given a polynomial $g(x)$ with $g(a) = 0$, we can write (not necessarily uniquely)

$$g = (x_1 - a_1) h_1 + \dots + (x_n - a_n) h_n.$$


One way to show this is to do a Taylor expansion:


Fact 93 (Taylor expansion in multiple variables)


If we've taken 18.02, we know that we can write polynomials as

$$g(x) = g(a) + \sum_i (x_i - a_i) \frac{\partial g}{\partial x_i}(a) + \frac{1}{2} \sum_{i,j} (x_i - a_i)(x_j - a_j) \frac{\partial^2 g}{\partial x_i \partial x_j}(a) + \cdots$$

and now, since $g(a) = 0$, put each remaining term into one of the $(x_i - a_i)$s as we wish.
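As a small worked instance (the polynomial here is chosen only for illustration and does not come from the lecture), take $g = x^2 + y^2 - 2$ and the point $a = (1, 1)$, where $g(a) = 0$. The terms can be grouped in more than one way, which illustrates why the expression is not unique:

```latex
\begin{aligned}
g &= (x - 1)(x + 1) + (y - 1)(y + 1) \\
  &= (x - 1)(x + y) + (y - 1)(y - x + 2).
\end{aligned}
```

The second line comes from adding the cross term $(x - 1)(y - 1)$ to the first group and subtracting it from the second.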


So this gives us something we can say about any ideal: 


Corollary 94 


Every ideal $I$ of $\mathbb{C}[x_1, \dots, x_n]$ with $I \subsetneq R$ is contained in some maximal ideal.


Proof. If $I$ is maximal, we're good. Otherwise, $I \subsetneq I_1 \subsetneq R$; if $I_1$ is maximal, we're also good. Otherwise, $I_1 \subsetneq I_2 \subsetneq R$, and keep going. This gives us a chain
$$I = I_0 \subsetneq I_1 \subsetneq I_2 \subsetneq \cdots$$


To finish, we want to show that this chain can't be infinite:


Lemma 95 


Given a chain $I_0 \subset I_1 \subset I_2 \subset \cdots$ in a ring $R$, the union $I = \bigcup I_n$ is an ideal.


We just need to show the two closure properties. If $a, b$ are in $I$, the union of all $I_n$, then $a \in I_x$ and $b \in I_y$ for some $x, y$. Then both are in $I_{\max(x,y)}$, so $a + b$ is in $I_{\max(x,y)}$, and $ra \in I_{\max(x,y)}$ for any $r \in R$ as well, so both of these are also in $I$.

So if our chain is infinite, $J = \bigcup I_n$ is an ideal. By the Hilbert Basis Theorem, $J$ is generated by a finite number of elements. Let's call them $b_1, \dots, b_k$. Then for some $n$ large enough, all $b_i \in I_n$: now $J \subset I_n$, so $J = I_n$ and the chain cannot grow past $I_n$. All $I_n$s are proper (not $R$), so $J$ is also proper. This means our chain does stop eventually, and we do have a maximal ideal at the end of our chain.


Example 96 


Take our ring to be $\mathbb{C}[x, y]$. Let $I$ be the ideal generated by $x^2 + y^2 - 1$, $xy - 1$, and $y - x^3$.


We claim $I = R$, and we'll show this by showing that it is not contained in a maximal ideal. All maximal ideals are kernels of substitutions, so if the generators were in a maximal ideal, there would be a point at which all three polynomials evaluate to zero. This requires $xy = 1$ and $y = x^3$, so $x^4 = 1$, and $x^2 + y^2 = 1 \implies x^4 + 1 = x^2$ (multiplying by $x^2$, since $x^2 y^2 = 1$). This implies $x^2 = 2$, but $x^4 = 1$. So there is no point that is a zero of all three polynomials: they must therefore generate the unit ideal.
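The no-common-zero claim can also be checked numerically. The sketch below assumes the three generators are $x^2 + y^2 - 1$, $xy - 1$, and $y - x^3$; the names `generators`, `candidates`, and `common_zeros` are illustrative only. Since $xy = 1$ and $y = x^3$ force $x^4 = 1$, only four values of $x$ need testing.

```python
# Generators of the ideal, read as functions on C^2 (assumed form of Example 96).
def generators(x, y):
    return (x**2 + y**2 - 1, x * y - 1, y - x**3)

# xy = 1 and y = x^3 force x^4 = 1, so these are the only candidate x-values.
candidates = [1, -1, 1j, -1j]

# Keep a candidate only if all three generators vanish at the point (x, x^3).
common_zeros = [
    x for x in candidates
    if all(abs(value) < 1e-12 for value in generators(x, x**3))
]

print(common_zeros)
```

An empty list is consistent with the argument that the generators have no common zero.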


Let's go over the proof of the Hilbert Nullstellensatz (Theorem 92): 


Proof. Start with an unknown maximal ideal $M$ of $R = \mathbb{C}[x_1, \dots, x_n]$. Then $F = R/M$ is a field, and we have a surjective homomorphism $\pi: R \to F$. Restrict $\pi$ to the subring $\mathbb{C}[x_1]$; call this map $\varphi_1$. All ideals in $\mathbb{C}[x_1]$ are principal, and $\varphi_1$ maps into a field $F$, so its kernel is a prime ideal: $\ker \varphi_1$ must look like $(x_1 - a_1)$ (the ideal generated by a linear polynomial) or just the zero ideal. The first case is good, since being able to do this for each $x_i$ gives our generators. So we have to rule out the case where $\ker \varphi_1$ is trivial.

If $\varphi_1$ has trivial kernel, then $\varphi_1$ is injective. We can therefore extend $\varphi_1$ to the fraction field $\mathbb{C}(x_1) \to F$ (the round bracket notation means fractions of polynomials). As a complex vector space, $F$ is spanned by a countable set, since $R$ is spanned by monomials $x_1^{e_1} \cdots x_n^{e_n}$ and we have a surjective map. But the field of fractions $\mathbb{C}(x_1)$ contains an uncountable independent set

$$\left\{ \frac{1}{x_1 - c} \;\middle|\; c \in \mathbb{C} \right\},$$

because it's not possible to have nonzero $b_i$ with

$$\sum_i \frac{b_i}{x_1 - c_i} = 0$$

if we look locally near $c_i$, where there's a pole. (It's important to note that any linear combination only uses a finite number of the spanning elements.) If a vector space is spanned by a countable set, every independent set is countable or finite, which contradicts $\mathbb{C}(x_1)$ being a subring of $F$ (since we assumed injectivity). Thus all $\varphi_i$ have kernel equal to $(x_i - a_i)$, and thus we've found generators for our maximal ideal.


We have a quiz on Monday in Walker. Our TA will be at a meeting, so it is unclear whether quizzes can be given back on Wednesday.
(The exam 1 average was above 80, so we all get cookies.)

Our new topic for this class will be factoring and irreducibility. Let’s start with polynomials R = F[x], where F 
is a field. Sometimes, like in the rationals, it’s possible to have irreducible polynomials of all degrees. 

We usually have two main tools for factoring: we can often use division with remainder, and we can use the fact 
that R is a principal ideal domain. 


In general, we want to work with monic polynomials
$$f(x) = x^n + a_{n-1} x^{n-1} + \cdots + a_0, \quad a_i \in F,$$
and we can turn $f$ into this form by multiplying by the inverse of the leading coefficient.


Definition 97 


Assume $f, p$ are monic. An irreducible polynomial $f$ is a polynomial where $f \neq 1$ and it can't be factored as $f = gh$ with $g, h$ not units (in this case, this means $g, h$ have degree at least 1). A polynomial $p$ is prime if $p \neq 1$ and for all $f, g$, if $p$ divides $fg$, then $p$ divides either $f$ or $g$.


Lemma 98 


Every irreducible element in a principal ideal domain is a prime element. 


Note that this result is not true for all rings: 


Example 99 
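Let $R = \mathbb{Z}[\delta]$, where $\delta = \sqrt{-5}$. Note that 2 divides $(1 + \delta)(1 - \delta) = 6$, but 2 does not divide $1 + \delta$ or $1 - \delta$.


However, 2 is irreducible: a proper factor would need absolute value strictly between 1 and 2, and every element of $R$ other than $0$ and $\pm 1$ has absolute value at least 2.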
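The claims in Example 99 are easy to confirm by brute force. This is a minimal sketch, assuming elements $a + b\delta$ of $\mathbb{Z}[\sqrt{-5}]$ are stored as integer pairs `(a, b)`; the helper names `mul`, `norm`, and `divisible_by_2` are made up for the illustration.

```python
def mul(u, v):
    """Multiply a + b*delta and c + e*delta, where delta^2 = -5."""
    (a, b), (c, e) = u, v
    return (a * c - 5 * b * e, a * e + b * c)

def norm(z):
    """Squared absolute value of a + b*delta."""
    a, b = z
    return a * a + 5 * b * b

def divisible_by_2(z):
    # 2 divides a + b*delta exactly when both coordinates are even.
    return z[0] % 2 == 0 and z[1] % 2 == 0

product = mul((1, 1), (1, -1))          # (1 + delta)(1 - delta)

# Norms of all elements in a small box; a norm-2 element would have |a| <= 1, b = 0.
small_norms = {norm((a, b)) for a in range(-3, 4) for b in range(-3, 4)}

print(product, divisible_by_2(product))
print(divisible_by_2((1, 1)), divisible_by_2((1, -1)))
print(2 in small_norms)
```

A proper factor of 2 would have norm 2 (since the norm of 2 is 4), so the absence of norm 2 is what makes 2 irreducible even though it is not prime.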
Proof of lemma. Let $p$ be an irreducible element, and let's say that $p$ divides $fg$ but $p$ does not divide $f$. Let $I$ be the ideal generated by $p$ and $f$:
$$I = (p, f) = \{ rp + sf \mid r, s \in R \}.$$


Note that $I \supsetneq (p)$ (this is strictly greater since $f$ is not in the ideal generated by $p$), and since $I$ is principal, $I = (d)$ for some element $d$. So $(d) \supsetneq (p)$, meaning that $d \mid p$. But $p$ was irreducible, so up to unit factors either $d = 1$ or $d = p$.

We can't have $d = p$, since we assumed $(d)$ was strictly larger than $(p)$, so $d = 1$ and $I = (1)$. This means $1 = rp + sf$ for some $r, s \in R$. But then

$$1 = rp + sf \implies g = r(pg) + s(fg),$$

and $p$ divides both terms on the right side, so $p$ divides $g$.


Corollary 100 


For any field $F$, all monic polynomials $f \neq 1$ in $F[x]$ are a product of irreducible elements, with uniqueness up to ordering.


We can also make a similar argument about unique factorization in $\mathbb{Z}[i]$, except for unit factors like $\pm 1, \pm i$. Let's be a bit more precise about that:


Definition 101 
A unit in a ring $R$ is an element that is invertible. Two elements $a, b \in R$ are associate if we can write $b = ua$ for some unit $u \in R$.


Example 102
In $\mathbb{Z}[i]$, $1 + 3i$ is associate to $-1 - 3i$, $-3 + i$, and $3 - i$.


So the idea is that two factorizations of a Gaussian integer can look different: we can multiply one factor by $i$ and another by $-i$.


Example 103 


Let's factor $1 + 3i$ in the Gaussian integers.


Start by multiplying it by its conjugate to get a norm
$$(1 + 3i)(1 - 3i) = 10 = 2 \cdot 5.$$


Now we can factor the two terms on the right: 2 factors as $(1 + i)(1 - i)$, and 5 factors as $(2 + i)(2 - i)$. So

$$(1 + 3i)(1 - 3i) = (1 + i)(1 - i)(2 + i)(2 - i).$$


Now $1 + 3i$ is "probably" a product of some of them (or their associates): we'll try a few different things. $(1 + i)(1 + 2i) = -1 + 3i$, which doesn't work. But $(1 - i)(2 + i) = 3 - i$ is an associate of $1 + 3i$, so we've found our factorization.
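The trial-and-error step can be automated. The following sketch assumes Gaussian integers are stored as pairs `(a, b)` for $a + bi$; `gmul`, `associates`, and `hits` are illustrative names. It tries each factor of 2 against each factor of 5 and keeps the products that are associates of $1 + 3i$.

```python
def gmul(u, v):
    """Multiply Gaussian integers a + bi and c + di stored as pairs."""
    (a, b), (c, d) = u, v
    return (a * c - b * d, a * d + b * c)

UNITS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # 1, -1, i, -i

def associates(z):
    """All unit multiples of z."""
    return {gmul(u, z) for u in UNITS}

target = (1, 3)                       # 1 + 3i
factors_of_2 = [(1, 1), (1, -1)]      # 1 + i, 1 - i
factors_of_5 = [(2, 1), (2, -1)]      # 2 + i, 2 - i

hits = [
    (p, q) for p in factors_of_2 for q in factors_of_5
    if gmul(p, q) in associates(target)
]

print(sorted(associates(target)))
print(hits)
```

Both surviving pairs use the factor $2 + i$; they differ only by replacing $1 + i$ with its associate $1 - i = -i(1 + i)$, so they describe the same factorization up to units.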
Theorem 104


For any principal ideal domain $R$, all nonzero, non-unit $a \in R$ are products of prime (or irreducible) elements

$$a = p_1 p_2 \cdots p_k.$$

This is unique in the sense that if $a = q_1 q_2 \cdots q_l$, then $k = l$, and we can reorder the $q_j$s so that all $p_j$s are associates of the corresponding $q_j$s.


One noteworthy idea here: factorization will always terminate in a principal ideal domain. 


Proof. The idea is that if we start with $a$ nonzero and not a unit, then either it is irreducible (in which case we're done), or $a = a_1 b_1$ with neither factor a unit. In the second case, we know that

$$(a) \subsetneq (a_1),$$

since $a$ doesn't divide $a_1$ unless $b_1$ is a unit. Then either $a_1$ is irreducible (and we're done) or $a_1 = a_2 b_2$, and so on. If this never stopped, it would create an infinite chain
$$(a) \subsetneq (a_1) \subsetneq (a_2) \subsetneq \cdots$$


and we can't keep doing this, because the union of an increasing family of ideals is an ideal by Lemma 95.
So $J = \bigcup (a_i)$ is an ideal, and it is principal (because $R$ is a principal ideal domain), so $J = (d)$ for some $d$. $d$ is in the union of the ideals, so it is in some $(a_i)$. Then $(a_i) \subset (d) \subset (a_i)$, meaning $(a_i)$ is the largest our ideals get. And thus we can't get an infinite chain, and therefore we can indeed write $a$ as a product of prime elements.


Showing uniqueness comes from each $p_i$ needing to divide some $q_j$ and vice versa.


But the converse is not true: not all rings with unique factorization are principal ideal domains. Let's stop working with monic polynomials now and look in $\mathbb{Z}[x]$. An example of a factorization in $\mathbb{Z}[x]$ is
$$f(x) = 2(x^2 + 1)(5x - 3).$$


There are two complications compared to the $F[x]$ case: we can't always make polynomials monic, and we may have a constant factor as well.
What tools do we have to deal with this? First of all, $\mathbb{Z}[x]$ is a subset of $\mathbb{Q}[x]$, the ring of polynomials with rational coefficients, so we can use some properties of $\mathbb{Q}[x]$ to help us out. Also, we have the ring homomorphism
$$\pi_p : \mathbb{Z}[x] \to \mathbb{F}_p[x]$$


which sends $f(x) \mapsto \pi_p(f) = \bar{f}(x)$ by reducing the coefficients mod $p$. If we have a polynomial that we can't factor in $\mathbb{Z}[x]$, we want to show it is irreducible: perhaps we can show that it is irreducible in $\mathbb{F}_p[x]$ instead, which is easier since there are only finitely many polynomials of any given degree! In particular, we can try all pairs of polynomials and see if any of them multiply to our polynomial $\bar{f}$.


Example 105 


Consider the polynomials in $\mathbb{F}_2[x]$. The first few are

$$1, \; x, \; x + 1, \; x^2, \; x^2 + 1, \; x^2 + x, \; x^2 + x + 1, \; \dots$$

To find the irreducible polynomials, we can do something similar to the Sieve of Eratosthenes. $x^2$ and $x^2 + x$ are factorable, and $x^2 + 1 = x^2 - 1 = (x + 1)^2$ is factorable as well. $x^2 + x + 1$ has no roots, so it is irreducible. A reducible cubic must be a linear times a quadratic polynomial, so we just check all polynomials of the form $x^3 + ax^2 + bx + 1$ (since we don't want $x$ to be a factor) and see if $x + 1$ is a factor, that is, if $x = 1$ is a root: we see that $x^3 + x + 1$ and $x^3 + x^2 + 1$ are irreducible.
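The sieve idea can be sketched in a few lines. The encoding below is an assumption made for the illustration, not something from the lecture: a polynomial over $\mathbb{F}_2$ is stored as an integer bitmask, with bit $k$ holding the coefficient of $x^k$ (so `0b111` is $x^2 + x + 1$). The names `clmul`, `irreducibles_up_to`, and `show` are hypothetical helpers.

```python
def degree(f):
    return f.bit_length() - 1

def clmul(f, g):
    """Multiply two F_2[x] polynomials stored as bitmasks (XOR is addition mod 2)."""
    result = 0
    while g:
        if g & 1:
            result ^= f
        f <<= 1
        g >>= 1
    return result

def irreducibles_up_to(max_degree):
    limit = 1 << (max_degree + 1)
    reducible = set()
    # Sieve: cross off every product of two polynomials of degree >= 1.
    for f in range(2, limit):
        for g in range(f, limit):
            if degree(f) + degree(g) > max_degree:
                break  # degrees only grow as g increases
            reducible.add(clmul(f, g))
    return [f for f in range(2, limit) if f not in reducible]

def show(f):
    """Render a bitmask as a polynomial string, highest degree first."""
    names = {0: "1", 1: "x"}
    terms = [names.get(k, f"x^{k}") for k in range(degree(f), -1, -1) if (f >> k) & 1]
    return " + ".join(terms)

print([show(f) for f in irreducibles_up_to(3)])
```

Raising `max_degree` extends the sieve; the work grows quickly, which is one reason this is done by machine rather than by hand.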


Example 106 


A polynomial like $x^3 + 2x^2 + 7x + 15$ is irreducible in $\mathbb{Z}[x]$, because its reduction $x^3 + x + 1$ is irreducible in $\mathbb{F}_2[x]$.


Definition 107 


A polynomial f(x) € Z[x] is primitive if the greatest common divisor of its coefficients is 1. 


In other words, no prime divides all coefficients of $f$, so $\pi_p(f) \neq 0$ for all primes $p$.


Theorem 108 (Gauss’ Lemma) 


If f, g are primitive polynomials in Z[x], then fg is primitive. 


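Proof. It suffices to show that $\pi_p(fg) \neq 0$ for all primes $p$. Since $\pi_p$ is a homomorphism, $\pi_p(fg) = \pi_p(f)\pi_p(g)$. Both $\pi_p(f)$ and $\pi_p(g)$ are nonzero because $f$ and $g$ are primitive, so each has a nonzero leading term, and the product of those leading terms is not zero. (Alternatively, $F[x]$ is a domain for any field $F$.) Thus $\pi_p(fg) \neq 0$.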
We'll be talking more about factoring in the ring of polynomials $\mathbb{Z}[x]$ today. Recall that we have two main tools: one is to view polynomials in $\mathbb{Z}[x]$ as polynomials in $\mathbb{Q}[x]$, which is easier because $\mathbb{Q}[x]$ is a principal ideal domain. Alternatively, we can look at the homomorphism $\pi_p : \mathbb{Z}[x] \to \mathbb{F}_p[x]$, which reduces the coefficients mod $p$.

Last time, we defined a primitive polynomial $f(x) \in \mathbb{Z}[x]$ to be a nonconstant polynomial with integer coefficients, where the greatest common divisor of its coefficients is 1. (The leading coefficient has to be positive.) Theorem 108 (Gauss' Lemma) tells us that if $f, g$ are primitive polynomials, then $fg$ is primitive as well. This has the following useful corollary:


Corollary 109 


Let $f$ be a primitive polynomial. If $g \in \mathbb{Z}[x]$ satisfies $g = fq$, and $q$ has rational coefficients, then $q \in \mathbb{Z}[x]$.


So if $f$ is a primitive polynomial, and $f$ divides $g$ in $\mathbb{Q}[x]$, then $f$ divides $g$ in $\mathbb{Z}[x]$ as well!


Proof. Clear the denominators, so $q_1 = dq$, where $d$ is the constant with smallest absolute value needed to make $q_1$ have integer coefficients. Note that this means $q_1$ is primitive.


Now $dg = dfq = fq_1$, and by the Gauss Lemma, $fq_1$ is primitive. But now the only way for $fq_1 = dg$ to be primitive is if the constant factor $d$ is $\pm 1$: otherwise, the coefficients of $dg$ would have a common factor. Thus $q$ must have started out with integer coefficients as well.
Corollary 110


If $g$ is irreducible in $\mathbb{Z}[x]$, then $g$ is irreducible in $\mathbb{Q}[x]$ as well.


Proof. This looks very similar to the above argument. We show the contrapositive: if $g = fq$, where $f, q \in \mathbb{Q}[x]$, then $g$ factors in $\mathbb{Z}[x]$ as well. Move the constants around so that $f$ is primitive (and has integer coefficients). Now pick the constant $c$ such that $q_1 = cq$ (like above). Now $cg = cfq = fq_1$, but $f$ and $q_1$ are primitive so $c = \pm 1$, and thus $q$ must have had integer coefficients as well.


Corollary 111 


The irreducible elements of $\mathbb{Z}[x]$ are $\pm$ integer primes and $\pm$ irreducible polynomials. Thus, the irreducible elements of $\mathbb{Z}[x]$ are prime elements.


Remember that $f$ is prime if $f \mid gh$ implies that $f \mid g$ or $f \mid h$.


Proof. Suppose $f$ is an irreducible polynomial. Then if $f \mid gh$ in $\mathbb{Z}[x]$, then $f \mid gh$ in $\mathbb{Q}[x]$. But $f$ is also irreducible in $\mathbb{Q}[x]$ (Corollary 110), and irreducibles are primes in a principal ideal domain, so $f \mid g$ or $f \mid h$ in $\mathbb{Q}[x]$, and therefore $f \mid g$ or $f \mid h$ in $\mathbb{Z}[x]$ (Corollary 109), and thus $f$ is prime. The converse (that integer primes and irreducible polynomials are irreducible) is easy to show.


With all of this, we've shown the following result (essentially by switching back and forth between $\mathbb{Q}[x]$ and $\mathbb{Z}[x]$):


Theorem 112 


The polynomial ring $\mathbb{Z}[x]$ has unique factorization into prime polynomials. Thus, any $g(x)$ can be written as

$$g = \pm p_1 \cdots p_k f_1 \cdots f_l,$$

where $p_i$ are integer primes and $f_j$ are primitive irreducible polynomials.


Let's shift to our other tool now: using the homomorphism $\pi_p$. If $f$ factors in $\mathbb{Z}[x]$, it will factor in $\mathbb{F}_p[x]$ as well: take $f = gh$ to $\bar{f} = \bar{g}\bar{h}$. This breaks down if the leading coefficient of $f$ is divisible by $p$ though, and we should always pick a good prime so that we don't have $\bar{g} = 1$.


Example 113 


Consider the polynomial $f(x) = 5x^5 + 4x^4 + 3x^3 + 2x^2 + x - 1$ and reduce mod 2. Then $\bar{f} = x^5 + x^3 + x + 1$. Notice that $x + 1$ is a factor of this, since $x = -1$ is a root of $\bar{f}$: therefore, our polynomial factors as
$$\bar{f} = (x + 1)(x^4 + x^3 + 1),$$


where we can check that the second factor is irreducible. So if $f$ factors, it must look like the product of a linear and quartic polynomial. (Note that $\mathbb{F}_2[x]$ has unique factorization.) The linear polynomial must look like $(ax + b)$, where $b$ is a factor of 1 and $a$ is a factor of 5 (by the Rational Root Theorem). So now we just check $\pm 1, \pm\frac{1}{5}$: none of these are roots of $f$, so our polynomial is irreducible in $\mathbb{Z}[x]$.
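Both computational steps of this example can be double-checked with exact arithmetic. The sketch below assumes $f(x) = 5x^5 + 4x^4 + 3x^3 + 2x^2 + x - 1$ with coefficients listed from the highest degree down; `evaluate` and `polymul_mod2` are illustrative helper names.

```python
from fractions import Fraction

coeffs = [5, 4, 3, 2, 1, -1]   # highest degree first

def evaluate(coeffs, x):
    """Horner evaluation with exact rational arithmetic."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

# Rational Root Theorem candidates: numerator divides 1, denominator divides 5.
candidates = [Fraction(sign, denom) for denom in (1, 5) for sign in (1, -1)]
rational_roots = [x for x in candidates if evaluate(coeffs, x) == 0]

def polymul_mod2(f, g):
    """Multiply coefficient lists and reduce every coefficient mod 2."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % 2
    return out

reduced = [c % 2 for c in coeffs]                  # x^5 + x^3 + x + 1
product = polymul_mod2([1, 1], [1, 1, 0, 0, 1])    # (x + 1)(x^4 + x^3 + 1)

print(rational_roots)
print(reduced, product)
```

If `rational_roots` is empty, $f$ has no linear factor over $\mathbb{Q}$, which together with the mod 2 factorization pattern rules out every factorization.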
Fact 114


Computers can do this in "zero" time. But it's very painful to try to do it mod powers of 2. It's actually a linear algebra problem over $\mathbb{F}_2$, but this is generally not suitable for doing in class (we shouldn't try it).


Now we'll demonstrate another useful tool: 


Example 115 


Consider $f(x) = x^5 + 3x^4 - 6x^3 + 9x^2 - 3$. Is this irreducible in $\mathbb{Z}[x]$?


Take the prime $p = 3$. Our polynomial reduces to $\bar{f} = x^5$, so any factorization $f = gh$ gives $\bar{f} = \bar{g}\bar{h} = x^5$. Let's say $g$ has degree 2 and $h$ has degree 3: note that $\bar{g} = x^2$ and $\bar{h} = x^3$ works, and by unique factorization this is the only way to write $x^5$ as a product of a quadratic and a cubic polynomial. So now
$$g = x^2 + b_1 x + b_0, \quad h = x^3 + c_2 x^2 + c_1 x + c_0,$$


where $3 \mid b_0, c_0$. But this implies $9 \mid b_0 c_0 = -3$, which is not true! So this is not a valid factorization for the original $f(x)$. We can repeat this argument in general for any degrees of $g, h$: this means $f$ is irreducible.


Here's a way to state this in more generality: 


Theorem 116 (Eisenstein’s criterion) 


Suppose we have a polynomial $f \in \mathbb{Z}[x]$ of the form

$$f(x) = a_n x^n + \cdots + a_0.$$

If a prime $p$ divides all coefficients except the leading $a_n$, and $p^2$ does not divide $a_0$, then $f$ is irreducible in $\mathbb{Q}[x]$.


Proof. Repeat the process we used above in the general case.
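The hypotheses of the criterion are mechanical to test. Here is a minimal sketch; the function name `eisenstein_applies` and the convention of listing coefficients from the leading one down are assumptions of this example. It only checks the hypotheses for one given prime, so a `False` result says nothing about irreducibility.

```python
def eisenstein_applies(coeffs, p):
    """Check Eisenstein's hypotheses for the prime p.

    coeffs lists a_n, ..., a_0 (leading coefficient first, at least two entries).
    """
    leading, *rest = coeffs
    constant = coeffs[-1]
    return (
        leading % p != 0                    # p does not divide a_n
        and all(c % p == 0 for c in rest)   # p divides every other coefficient
        and constant % (p * p) != 0         # p^2 does not divide a_0
    )

# x^5 + 3x^4 - 6x^3 + 9x^2 - 3 (the x coefficient is 0)
example = [1, 3, -6, 9, 0, -3]
print(eisenstein_applies(example, 3))
print(eisenstein_applies(example, 2))
```

In practice one would try each prime dividing the constant term, since those are the only primes that can satisfy the hypotheses.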


We'll finish by talking about cyclotomic polynomials. Consider the $p$th roots of unity $\zeta^k$, where $\zeta = e^{2\pi i/p}$. All of these are roots of the polynomial $x^p - 1$, but there's the trivial root $x = 1$. So we want the polynomial with roots

$$\zeta, \zeta^2, \dots, \zeta^{p-1}.$$


Definition 117
This polynomial

$$\frac{x^p - 1}{x - 1} = x^{p-1} + \cdots + x + 1$$

is called the $p$th cyclotomic polynomial. In general, the $n$th cyclotomic polynomial $\Phi_n(x)$ comes from multiplying $(x - \zeta^k)$ for all primitive $n$th roots of unity $\zeta^k$.


Theorem 118 


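The polynomial $f(x) = x^{p-1} + \cdots + x + 1$ is irreducible in $\mathbb{Q}[x]$ for any prime $p$.


Proof. Make a change of variables: let $x = y + 1$. Then

$$f(x) = f(y + 1) = \frac{(y + 1)^p - 1}{y},$$

where $(y + 1)^p$ can be expanded by the binomial theorem:

$$(y + 1)^p = y^p + \binom{p}{1} y^{p-1} + \cdots + \binom{p}{p-1} y + 1,$$

and all coefficients except the first and last have exactly 1 factor of $p$ in them. Subtract 1 and divide by $y$, and now

$$f(y + 1) = y^{p-1} + \binom{p}{1} y^{p-2} + \cdots + \binom{p}{p-1}$$

must be irreducible by Eisenstein's criterion (the constant term is $\binom{p}{p-1} = p$). A factorization of $f(x)$ would give one of $f(y + 1)$, so $f$ is irreducible too.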
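The divisibility pattern of the binomial coefficients can be spot-checked for small primes. This sketch uses `math.comb` from the standard library; the function name `shifted_cyclotomic` is made up for the illustration.

```python
from math import comb

def shifted_cyclotomic(p):
    """Coefficients of ((y + 1)^p - 1) / y, leading coefficient first.

    The coefficient of y^(p-1-j) is C(p, p-j) = C(p, j).
    """
    return [comb(p, j) for j in range(p)]

for p in (3, 5, 7, 11):
    coeffs = shifted_cyclotomic(p)
    middle_ok = all(c % p == 0 for c in coeffs[1:])    # p divides all but the leading term
    constant_ok = coeffs[-1] % (p * p) != 0            # p^2 does not divide the constant term
    print(p, coeffs, middle_ok and constant_ok)
```

For $p = 5$ the list is `[1, 5, 10, 10, 5]`, that is, $y^4 + 5y^3 + 10y^2 + 10y + 5$.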


Corollary 119 


Since $\zeta_p$ is a root of $\Phi_p(x)$,

$$f(\zeta) = \zeta^{p-1} + \cdots + \zeta + 1 = 0$$

is (up to scaling) the only linear (rational) relation among the powers $1, \zeta, \dots, \zeta^{p-1}$. In other words, $\{\zeta, \zeta^2, \dots, \zeta^{p-1}\}$ are independent over $\mathbb{Q}$.


Proof. Consider the map $\varphi : \mathbb{Q}[x] \to \mathbb{C}$ that sends $x \mapsto \zeta$. By definition, the cyclotomic polynomial $f = \Phi_p$ is in the kernel. $\ker \varphi$ is a principal ideal, and it is generated by the minimal polynomial with $\zeta$ as a root. This is $\Phi_p$, because it is irreducible and has $\zeta$ as a root! So the kernel of $\varphi$ is generated by $f$, and by the first isomorphism theorem, $\mathbb{Q}[x]/(f)$ is isomorphic to the image $\mathbb{Q}[\zeta]$. Since $f$ has degree $p - 1$, the image must have dimension $p - 1$, and the residues $1, \zeta, \dots, \zeta^{p-2}$ form a basis.


14. March 11, 2019 


We're going to shift topic to doing arithmetic in imaginary quadratic fields. We'll start with $R = \mathbb{Z}[i]$, which is easy to deal with because we have a principal ideal domain.


Remember that the explicit definition of the Gaussian integers $\mathbb{Z}[i]$ is

$$\{a + bi \mid a, b \in \mathbb{Z}\}.$$


Definition 120


The norm of a Gaussian integer $\alpha = a + bi$ is

$$N(\alpha) = \bar{\alpha}\alpha = |\alpha|^2 = a^2 + b^2.$$


This has two useful properties: $N(\alpha)$ is multiplicative, and it is always a positive integer for $\alpha \neq 0$. Since every ideal of $\mathbb{Z}[i]$ is principal, irreducible elements are prime elements. Remember the definitions here: if $\pi$ is irreducible, then $\pi \neq 0$, $\pi$ is not a unit, and it has no proper factors. Meanwhile, $\pi$ is prime if $\pi \mid ab \implies \pi \mid a$ or $\pi \mid b$. Note that the ideal generated by a prime element $\pi$, $\pi R$, is maximal, so $\bar{R} = R/(\pi R)$ is a field. And $\mathbb{Z}[i]$ has unique factorization into Gauss primes (primes in $\mathbb{Z}[i]$), up to associates, which differ by one of the units $\{\pm 1, \pm i\}$. So, for example, unique factorization treats

$$(1 + i)(1 + 2i) = (1 - i)(-2 + i) = -1 + 3i$$

as equivalent factorizations.
Proposition 121


Let $\pi$ be a Gauss prime. Then $N(\pi) = a^2 + b^2$ is either a prime integer $p$ or the square of a prime integer $p^2$.


Additionally, this implies that if $N(\pi) = \bar{\pi}\pi = p^2$, then $\pi$ and $p$ are associates (by unique factorization).
Proof. We know that $\bar{\pi}\pi$ is an integer, so we can factor it in $\mathbb{Z}$ as
$$\bar{\pi}\pi = p_1 \cdots p_k.$$


But this is also a valid equation in the Gaussian integers. Since $\pi$ is a Gauss prime, so is $\bar{\pi}$ (just take conjugates), and now $p_1 \cdots p_k$ can only have at most two terms! So either $\bar{\pi}\pi = p$ for some prime $p$ or $\bar{\pi}\pi = p_1 p_2$.
In the last case, $p_1$ and $p_2$ must be associates of $\pi$ and $\bar{\pi}$. Say that $p_1$ and $\pi$ are associates. Then $p_2$ and $\bar{\pi}$ are associates, but we also know that $\bar{\pi}$ and $\bar{p}_1 = p_1$ are associates. So $p_1$ and $p_2$ are associates, which means $p_1 = p_2$ (in order for their product to be a positive integer), and this must just be $p^2$, as desired.


Corollary 122 


If $p$ is an integer prime, we can factor it in $\mathbb{Z}[i]$, and it factors either as $p$ or $\bar{\pi}\pi$, where $\pi$ is a Gauss prime.


Proof. Say we have an integer prime $p$. Factor it into Gauss primes
$$p = \pi_1 \cdots \pi_k.$$


Take the norm of both sides: then

$$p^2 = \bar{p}p = (\bar{\pi}_1\pi_1)(\bar{\pi}_2\pi_2) \cdots (\bar{\pi}_k\pi_k).$$

Each parenthetical term is an integer, and it is more than 1. By unique factorization of integers, these sides must be the same, so $k \le 2$. If $k = 1$, we just have

$$p^2 = \bar{\pi}_1\pi_1,$$
and by uniqueness in the Gaussian integers, this means we must have $p = \pi_1$, meaning $p$ is just a Gauss prime (or its associate). Meanwhile, if $k = 2$,

$$p^2 = \bar{p}p = (\bar{\pi}_1\pi_1)(\bar{\pi}_2\pi_2)$$

and similarly we must have $p = \bar{\pi}_1\pi_1 = \bar{\pi}_2\pi_2$.


Example 123 
We know that $2 = (1 + i)(1 - i)$, so $1 + i, 1 - i$ are both Gauss primes. We can't write 3 as a product of two Gaussian integers with smaller norm, so 3 is a Gauss prime. Finally, $5 = (2 + i)(2 - i)$, and $2 + i, 2 - i$ are both Gauss primes.


We say that 2 and 5 split in $\mathbb{Z}[i]$, but 3 remains prime.
Theorem 124


Let $p > 2$ be an odd prime. The following are equivalent:


1. $p$ is a Gauss prime.
2. $x^2 + 1$ is irreducible in $\mathbb{F}_p[x]$: that is, $-1$ is not a square mod $p$.
3. $p$ is not a sum of two integer squares.
4. $p$ is 3 mod 4.


Proof that (1) and (2) are equivalent. Remember that given a Gauss prime $\pi$, $(\pi R)$ is a maximal ideal, so $\bar{R} = R/(\pi R)$ is a field.


We can think of the Gaussian integers as the image
$$\varphi: \mathbb{Z}[x] \to \mathbb{Z}[i] = R$$


under the substitution $x \mapsto i$. Compose this with a map from $R$ to $\bar{R} = R/pR$: this first kills $x^2 + 1$, and then it kills $p$.

Alternatively, we can start with $\mathbb{Z}[x]$, kill $p$ to get to $S = \mathbb{F}_p[x]$, and then we can substitute $i$ (by modding out by the polynomial $x^2 + 1$) to get $\bar{S} = S/((x^2 + 1)S)$. This kills $p$ first and then $x^2 + 1$.

But notice that $\bar{R}$ and $\bar{S}$ are isomorphic, so they are fields for the same values of $p$. So $\bar{R}$ is a field if and only if $p$ is a Gauss prime, and because $\mathbb{F}_p[x]$ is a principal ideal domain, $\bar{S}$ is a field if and only if $x^2 + 1$ is an irreducible polynomial in $\mathbb{F}_p[x]$. This is exactly the correspondence that we want.


Let's look a bit more carefully at the structure here and justify the idea of $\bar{R}$ and $\bar{S}$ being isomorphic:


Lemma 125
Let $u: R \to R'$ and $v: R' \to R''$ be surjective homomorphisms, and let $w = v \circ u$. (Assume we have a principal ideal domain to make notation easier.) Let $\ker u = xR$ and $\ker v = y'R'$, and let $y \in R$ be an element such that $u(y) = y'$ (which exists by surjectivity). Then

$$\ker w = (x, y)R.$$


Then by the first isomorphism theorem, we know that
$$R'' \cong R/\ker w,$$
so our logic above was valid (because $(x, y)R$ and $(y, x)R$ are the same thing).


Proof. Let $a \in \ker w$. Then $u(a) = a'$ must be in the kernel of $v$, so $a' = r'y'$ for some $r' \in R'$. Since $u$ is surjective, choose $r \in R$ such that $u(r) = r'$. Then let $b = a - ry$; notice that
$$u(b) = u(a) - u(r)u(y) = a' - r'y' = 0,$$
so $b \in \ker u$, meaning that $b = sx$ for some $s \in R$. Therefore

$$a = b + ry = sx + ry,$$

as desired: we've shown that $a \in (x, y)R$. (And clearly $x, y$ are both in the kernel, so we do need to include both.)
Let's finish the proof of Theorem 124:


Proof, continued. Let's show that the first and third conditions are equivalent. If $p$ is a sum of two squares, we can write it as
$$p = a^2 + b^2 = (a + bi)(a - bi),$$


meaning it is not a Gauss prime. And this logic works in reverse: if $p$ is not a Gauss prime, then Corollary 122 gives $p = \bar{\pi}\pi = a^2 + b^2$ for $\pi = a + bi$, so a prime that is not a sum of two squares cannot be factored in the Gaussian integers.

Finally, to show that the second and fourth conditions are equivalent, recall that $x^2 + 1$ is irreducible if and only if $-1$ is not a square mod $p$. We map the multiplicative group to itself: consider the map $\varphi: \mathbb{F}_p^\times \to \mathbb{F}_p^\times$ sending $a \mapsto a^2$. Since we have an abelian group here, this is a homomorphism, and the kernel is $\{\pm 1\}$. Thus the order of the image $H$ is $\frac{p-1}{2}$.

If $p$ is 3 mod 4, then the order of $H$ is odd, so $H$ has no elements of order 2. But if $p$ is 1 mod 4, then $H$ has an element of order 2, which must be $-1$ (this is the only one)!


So $-1$ is in the image if and only if it is a square, which happens if and only if $p$ is 1 mod 4.
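The equivalences of Theorem 124 can be tabulated for small primes. This is a brute-force sketch with illustrative names (`is_prime`, `minus_one_is_square`, `is_sum_of_two_squares`); it checks conditions 2, 3, and 4 directly and is evidence, not a proof.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % k for k in range(2, isqrt(n) + 1))

def minus_one_is_square(p):
    """True if a^2 = -1 (mod p) has a solution."""
    return any((a * a + 1) % p == 0 for a in range(1, p))

def is_sum_of_two_squares(p):
    """True if p = a^2 + b^2 for some integers a, b."""
    return any(isqrt(p - a * a) ** 2 == p - a * a for a in range(1, isqrt(p) + 1))

rows = [
    (p, p % 4, minus_one_is_square(p), is_sum_of_two_squares(p))
    for p in range(3, 60) if is_prime(p)
]

# For every odd prime, the three tests should agree: p = 1 mod 4 exactly when p splits.
consistent = all((mod4 == 1) == square == two_squares for _, mod4, square, two_squares in rows)

for row in rows:
    print(row)
print(consistent)
```

Primes such as 5, 13, 17 split in $\mathbb{Z}[i]$, while 3, 7, 11 remain Gauss primes.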


The next case we can look at is $\mathbb{Z}[\sqrt{-5}]$: the main difference is that ideals are no longer all principal.


15 March 13, 2019 


Let's review some concepts about $R = \mathbb{Z}[i]$. This is a principal ideal domain, and therefore it's also a unique factorization domain.

Take $A$ to be an ideal of $R$ which is not the zero ideal. Let $\alpha \in A$ be a nonzero element with minimal norm: we can show that $A$ is the principal ideal $\alpha R$ with the Division Algorithm. What does this ideal look like? We get translates of $\alpha$ and $\alpha i$, so we have a square grid of side length $|\alpha|$.


Fact 126


The idea is that any element $\beta \in A$ can't be strictly inside a square, as it'd be too close to one of the adjacent $\alpha$-points. We can see this by drawing circles of radius $|\alpha|$ from opposite corners of a square: this covers the whole square, so $\beta$ must be a multiple of $\alpha$.


We're going to use this same reasoning in another case now. 


Proposition 127 


$R = \mathbb{Z}[\sqrt{-2}]$ is a unique factorization domain.


Proof. Take any nonzero ideal $A$. Like before, choose $\alpha \neq 0$ in $A$ with minimal norm. Now the ideal generated by $\alpha$ looks a little different: we have a rectangular grid with side lengths in the ratio $1 : \sqrt{2}$ in the complex plane.


But the same tiling argument works here! Let's say $\beta \in A$ is not in $\alpha R$: we know $\beta$ can't be within $|\alpha|$ of any of the vertices of the rectangle. Again, the union of the circles completely covers the rectangle, so there is no such $\beta$ and $A = \alpha R$ is principal.


So $\mathbb{Z}[\sqrt{-2}]$ is both a PID (principal ideal domain) and a UFD (unique factorization domain). Notice that the only units here are $\pm 1$.
We're going to defer $\mathbb{Z}[\sqrt{-3}]$ for now.
Example 128


Now, let's try $\mathbb{Z}[\sqrt{-5}]$. Does the same argument work to show that this is a principal ideal domain?


If we try to make the same argument by starting with an $\alpha \in A$ with minimal length, we have a rectangle that is $\sqrt{5}|\alpha|$ by $|\alpha|$. But this time, our circles don't completely cover the rectangle, and we're in trouble.
Instead, the idea is to consider the half-lattice points. If $\beta \in A$ is close to $\frac{1}{2}(1 + \sqrt{-5})\alpha$, then $2\beta$ would be too close to $(1 + \sqrt{-5})\alpha$. So we also can't be within $\frac{|\alpha|}{2}$ of any of the half-lattice points.


But we've missed one detail: $\beta$ can't be close to $\frac{1}{2}(1 + \sqrt{-5})\alpha$, but it can be equal to it! So if we have an ideal $A$ strictly larger than $\alpha R$, where $\alpha$ is the point with minimal magnitude, then $A$ must contain one of the points
$$\tfrac{1}{2}\alpha\sqrt{-5}, \quad \tfrac{1}{2}\alpha(1 + \sqrt{-5}), \quad \tfrac{1}{2}\alpha(2 + \sqrt{-5}).$$


This first option is impossible: if $\gamma = \frac{1}{2}\sqrt{-5}\,\alpha$ is in $A$, then $\sqrt{-5}\gamma = -\frac{5}{2}\alpha$ is in $A$, meaning $\frac{1}{2}\alpha = \sqrt{-5}\gamma + 3\alpha$ is in $A$, contradicting the minimality of $\alpha$. Similarly, $\gamma = \frac{1}{2}\alpha(2 + \sqrt{-5})$ isn't allowed, either. This yields our result:


Proposition 129 
Let $R = \mathbb{Z}[\sqrt{-5}]$, and let $A$ be a nonzero ideal. If $\alpha$ is the nonzero element in $A$ with smallest magnitude, then either $A = \alpha R$ is a principal ideal, or $A = (\alpha, \beta)R$, where

$$\beta = \tfrac{1}{2}(1 + \sqrt{-5})\alpha.$$

(Sometimes, $\beta$ may not be in the ring, so we can't have $(\alpha, \beta)R$ as an ideal in that case either.)


Recall the issue with unique factorization in this ring that we explored a few days ago: if $\delta = \sqrt{-5}$, then
$$6 = 2 \cdot 3 = (1 + \delta)(1 - \delta).$$
We took ideals to rescue unique factorization by defining the two ideals
$$A = (2, 1 + \delta), \quad B = (3, 1 + \delta).$$


Let's review the logic there: to take the product ideal $AB$, notice that

$$AB = (6, 3 + 3\delta, 2 + 2\delta, (1 + \delta)^2)$$

has $(1 + \delta)R$ as a subideal (since $(3 + 3\delta) - (2 + 2\delta) = 1 + \delta$), and all of the elements in $AB$ are in the principal ideal $(1 + \delta)R$. So $AB$ is generated by $1 + \delta$.

From there, noting that $\bar{A} = (2, 1 - \delta)$ and $\bar{B} = (3, 1 - \delta)$, we found that $\bar{A}A = 2R$, $\bar{B}B = 3R$, and $\bar{A}\bar{B} = (1 - \delta)R$. This meant that our expression $6 = 2 \cdot 3 = (1 + \delta)(1 - \delta)$ could be rewritten as

$$(\bar{A}A)(\bar{B}B) = (AB)(\bar{A}\bar{B}).$$
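The claim that $AB = (1 + \delta)R$ can be verified on generators. The sketch below again stores $a + b\delta$ as a pair `(a, b)` with $\delta^2 = -5$; `mul` and `is_multiple_of` are hypothetical helpers, and divisibility is tested by multiplying by the conjugate and dividing by the norm.

```python
def mul(u, v):
    """Multiply a + b*delta and c + e*delta, where delta^2 = -5."""
    (a, b), (c, e) = u, v
    return (a * c - 5 * b * e, a * e + b * c)

def is_multiple_of(z, w):
    """True if z = w * q for some q in Z[delta], checked via z * conj(w) / N(w)."""
    n = w[0] ** 2 + 5 * w[1] ** 2
    a, b = mul(z, (w[0], -w[1]))
    return a % n == 0 and b % n == 0

A = [(2, 0), (1, 1)]   # generators 2 and 1 + delta
B = [(3, 0), (1, 1)]   # generators 3 and 1 + delta

# Generators of AB: all pairwise products.
products = [mul(a, b) for a in A for b in B]

# Every generator lies in (1 + delta)R ...
all_in_principal = all(is_multiple_of(z, (1, 1)) for z in products)

# ... and 1 + delta itself is (3 + 3*delta) - (2 + 2*delta), so it lies in AB.
difference = (products[2][0] - products[1][0], products[2][1] - products[1][1])

print(products)
print(all_in_principal, difference)
```

The two checks together give both inclusions, so $AB$ and $(1 + \delta)R$ coincide.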


Fact 130 


Note the difference between a lattice with basis $(2, 1 + \delta)$ (which is linear combinations with integer coefficients) and an ideal with generators $(2, 1 + \delta)$ (which is linear combinations with coefficients in the ring).


We claim that the generators $(2, 1 + \delta)$ of $A$ also form a lattice basis. To show this, we just need to show that $2\delta$ and $(1 + \delta)\delta$ are integer combinations of $2, 1 + \delta$, and indeed

$$2\delta = 2(1 + \delta) - 2, \qquad (1 + \delta)\delta = \delta - 5 = (1 + \delta) - 3 \cdot 2.$$

Drawing the lattice in the complex plane, it looks like a sheared rectangular grid. We basically take a checkerboard pattern of $\mathbb{Z}[\delta]$, and this means that $A$ is the second kind of ideal from Proposition 129.

On the other hand, $B = (3, 1 + \delta)$ also has a lattice basis (this can be checked), and drawing out the ideal gives another sheared rectangular grid. $1 + \delta$ has smallest magnitude here, and again we have the second type of ideal. That means $(2, 1 + \delta)$ and $(3, 1 + \delta)$ are actually similar figures.


Fact 131 
We say that there are two ideal classes in $\mathbb{Z}[\sqrt{-5}]$: the principal ideals and the $(\alpha, \beta)R$ example from above. In a principal ideal domain, there is only one ideal class, and it's of the form $\alpha R$.


So now let's try to do this procedure for $\mathbb{Z}[\sqrt{-3}]$. Most of the work here is in a homework problem, but the ideas are a little different: $\mathbb{Z}[\sqrt{-3}]$ is contained within $\mathbb{Z}[\omega]$, where $\omega = \frac{1}{2}(-1 + \sqrt{3}i)$, so it's better to use the ring $\mathbb{Z}[\omega]$.


16 March 15, 2019 


We'll continue our study of imaginary quadratic fields today. Let $d$ be a squarefree negative integer, and let $\delta^2 = d$. (For example, we can have $d = -1, -2, -3, -5$, and so on.) Consider the field $F = \mathbb{Q}[\delta]$: all elements in $F$ are of the form

$$a + b\delta, \quad a, b \in \mathbb{Q}.$$


Definition 132 


An algebraic integer in a field F is an element that is the root of a monic polynomial with integer coefficients. 


Equivalently, an algebraic integer is an element whose (monic) minimal polynomial has integer coefficients. 


In particular, if $\alpha = a + b\delta$ with $b \neq 0$, the minimal polynomial is
$$(x - \alpha)(x - \bar{\alpha}) = x^2 - (\alpha + \bar{\alpha})x + \bar{\alpha}\alpha,$$


so we must have $\alpha + \bar{\alpha} = 2a$ and $\bar{\alpha}\alpha = a^2 - b^2 d$ be integers if we want $\alpha$ to be an algebraic integer. Working out the fussy details, we must have

- both $a, b$ are integers, or

- $d \equiv 1 \bmod 4$ and $a, b \in \mathbb{Z} + \frac{1}{2}$ are both half-integers.


We should be careful here, though: $d \equiv 1 \bmod 4$ means $d = -3, -7, -11, \dots$, since $d$ is a negative integer.
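The trace-and-norm test is simple to try out with exact rationals. A minimal sketch, assuming $\alpha = a + b\delta$ with $\delta^2 = d$; the function name `is_algebraic_integer` is made up for this example.

```python
from fractions import Fraction

def is_algebraic_integer(a, b, d):
    """Test a + b*delta (delta^2 = d) by checking that trace and norm are integers."""
    a, b = Fraction(a), Fraction(b)
    trace = 2 * a                 # alpha + conjugate(alpha)
    norm = a * a - b * b * d      # alpha * conjugate(alpha)
    return trace.denominator == 1 and norm.denominator == 1

half = Fraction(1, 2)
for d in (-1, -2, -3, -5, -7):
    # Are the half-integer points (1 + delta)/2 algebraic integers for this d?
    print(d, d % 4, is_algebraic_integer(half, half, d))
```

Python's `%` returns a nonnegative remainder, so `d % 4 == 1` picks out exactly $d = -3, -7, -11, \dots$, matching the half-integer case above.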


Proposition 133 


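If an algebraic integer $\alpha$ is rational, then $\alpha \in \mathbb{Z}$.


Proof. Write $\alpha = \frac{p}{q}$ in lowest terms. If $\alpha$ is an algebraic integer, it is a root of a monic integer polynomial of some degree $n$, and the leading term $x^n$ contributes a $\frac{p^n}{q^n}$ term, which has factors in the denominator that the other terms cannot remove unless $q = \pm 1$, in which case $\alpha$ is an integer.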
Fact 134


In any field $F = \mathbb{Q}[\delta]$, the algebraic integers form a ring $R$, which is a lattice in $\mathbb{C}$.


Visually, if $d \not\equiv 1 \bmod 4$, then the lattice is a rectangular grid, and if $d \equiv 1 \bmod 4$, we also have the midpoints of those rectangles in our lattice. In both cases, we do have a lattice basis: everything is an integer combination of 1 and $\delta$ in the rectangular case, and of 1 and $\frac{1}{2}(1 + \delta)$ in the other case.


Let's focus more on the simpler versions of $R$: assume $d \not\equiv 1 \bmod 4$, so we're in one of the cases

$$d = -1, -2, -5, -6, -10, \dots$$


In such rings, any nonzero ideal $A$ is a sublattice of $R$ with lattice basis $\alpha, \beta$, and we can write
$$A = \alpha\mathbb{Z} + \beta\mathbb{Z}.$$


(This is an idea from 18.701: we have a discrete subset of the plane with two independent vectors.) We also have the idea of "ideal generators": elements $\alpha_1, \dots, \alpha_k$ generate $A$ if we can write
$$A = \alpha_1 R + \cdots + \alpha_k R.$$


Remember that given nonzero ideals $A, B$, the product ideal $AB$ is the set of finite sums $\sum_i \alpha_i \beta_i$, where $\alpha_i \in A, \beta_i \in B$. If $\alpha_1, \dots, \alpha_k$ generate $A$ and $\beta_1, \dots, \beta_l$ generate $B$, then we know that the products $\{\alpha_i \beta_j\}$ generate $AB$ (write any element as a linear combination). We also know that if $A$ is the principal ideal $\alpha R$, then $AB = \alpha RB = \alpha B$: we just take all the elements in $B$ and multiply them by $\alpha$.


Fact 135 
For any ideals A, B, C, we have that AB = BA, (AB)C = A(BC), and A(B + C) = AB + AC (because the ring 


is commutative, associative, and has distributivity). Also, RA = A for any A. 


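As we've already done in earlier examples, we can construct the set $\bar A = \{\bar\alpha \mid \alpha \in A\}$. This is an ideal, because (1) for any $\alpha \in A$, $\rho \in R$, we have $\rho\bar\alpha = \overline{\bar\rho\alpha}$, and (2) $\bar\rho \in R$, because $\bar R = R$.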


Proposition 136 (Main Lemma) 


Suppose $d$ is a negative squarefree integer. If $A$ is a nonzero ideal of $R$, the ring of algebraic integers in $\mathbb{Q}[\delta]$, then $\bar A A$ is a principal ideal of $R$ generated by a positive integer $n \in \mathbb{Z}$.


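Proof. Take a lattice basis $(\alpha, \beta)$ for $A$. Then $\bar A$ is generated by $\bar\alpha, \bar\beta$, so $\bar A A$ is generated by the four elements


$$\bar\alpha\alpha, \quad \bar\alpha\beta, \quad \bar\beta\alpha, \quad \bar\beta\beta.$$


We can't say that all four of these are rational, but we do know that $\bar\alpha\alpha$, $\bar\beta\beta$, and $\bar\alpha\beta + \bar\beta\alpha$ are rational (the last
one because it is equal to its own conjugate), and because they are algebraic integers, they are in $\mathbb{Z}$. Let $n$ be their
greatest common divisor: it is in the ideal $\bar A A$, because $n$ is an integer linear combination of the elements above. We claim that
$\bar A A = nR$. One direction is easy to show: since $n \in \bar A A$, we have $nR \subset \bar A A$.

Now, our goal is to show that $\bar\alpha\alpha, \bar\alpha\beta, \bar\beta\alpha, \bar\beta\beta \in nR$. The first and last are clear, because they are multiples of $n$ by construction: we just need to show that $\bar\alpha\beta$ and $\bar\beta\alpha$ are good as well. Equivalently, we can show that


$$\bar\alpha\beta \in nR \iff \frac{\bar\alpha\beta}{n} \in R$$


(and the same argument handles its conjugate $\bar\beta\alpha$). And we can show that $\gamma = \frac{\bar\alpha\beta}{n}$ is an algebraic integer by showing that $\gamma + \bar\gamma$ and $\gamma\bar\gamma$ are both integers. We compute
and find that


$$\gamma + \bar\gamma = \frac{\bar\alpha\beta + \alpha\bar\beta}{n},$$
and since we defined $n$ to be a factor of $\bar\alpha\beta + \alpha\bar\beta$, this is indeed an integer. Similarly, $\gamma\bar\gamma = \frac{\bar\alpha\beta\alpha\bar\beta}{n^2} = \frac{\bar\alpha\alpha}{n} \cdot \frac{\bar\beta\beta}{n}$, which is an integer, and we're done.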
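
To see the integer $n$ from the proof concretely, here is a small Python sketch. It represents $a + b\delta$ as an integer pair `(a, b)`; the helper names `conj`, `mul`, and `main_lemma_n` are ours. As an illustrative input we assume the ideal with lattice basis $(5, \delta - 2)$ in $\mathbb{Z}[\sqrt{-21}]$.

```python
from math import gcd

def conj(x):
    """Complex conjugate of a + b*delta, stored as the pair (a, b)."""
    a, b = x
    return (a, -b)

def mul(x, y, d):
    """Product of two elements of Z[delta], where delta^2 = d."""
    (a, b), (c, e) = x, y
    return (a * c + b * e * d, a * e + b * c)

def main_lemma_n(alpha, beta, d):
    """gcd of conj(alpha)*alpha, conj(beta)*beta and conj(alpha)*beta + conj(beta)*alpha."""
    aa = mul(conj(alpha), alpha, d)
    bb = mul(conj(beta), beta, d)
    ab = mul(conj(alpha), beta, d)
    ba = mul(conj(beta), alpha, d)
    cross = (ab[0] + ba[0], ab[1] + ba[1])
    # All three quantities are rational integers: their delta-coefficient is 0.
    assert aa[1] == 0 and bb[1] == 0 and cross[1] == 0
    return gcd(aa[0], gcd(bb[0], cross[0]))

# Assumed example: lattice basis (5, delta - 2) with d = -21.
print(main_lemma_n((5, 0), (-2, 1), -21))
```

Here the three integers are 25, 25, and $-20$, so $n = 5$, which agrees with the factorization $5R = \bar P P$ for $P = (5, \delta - 2)$.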


Corollary 137 (Cancellation law) 


If we have ideals such that $AB = AC$ and $A$ is nonzero, then $B = C$.


Proof. Multiply by $\bar A$. Then


$$\bar A AB = \bar A AC \implies (nR)B = (nR)C \implies nB = nC,$$


which means $B = C$.


Corollary 138 


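If $A \supset B$ (with $A$ nonzero), then there exists an ideal $C$ such that $B = AC$.


Proof. If $A \supset B$, then $\bar A A = nR \supset \bar A B$, so everything in the product ideal $\bar A B$ is a multiple of $n$. Then let $C = \frac{1}{n}\bar A B$: this is closed under multiplication by $R$ because $B$ is, so it's an ideal. Then


$$AC = \frac{1}{n} A \bar A B = \frac{1}{n}\, nRB = B,$$


as desired.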


Since every ideal is contained in a maximal ideal, this means we can factor ideals! 


Fact 139 


Instead of “maximal ideal,” we're going to think about “prime ideals,” but these are not generally equivalent. 


Definition 140 


Let $R$ be any ring. An ideal $P$ is a prime ideal if any of the following three equivalent conditions is satisfied:


• $R/P$ is a domain.

• $P$ is a proper ideal of $R$, and if $a, b \in R$ and $ab \in P$, then one of $a$ or $b$ is in $P$.

• $P$ is a proper ideal of $R$, and if $A, B$ are ideals with $AB \subset P$, then $A \subset P$ or $B \subset P$.


Let's show that the second condition implies the third: if $AB \subset P$ and $A \not\subset P$, then there exists an $a \in A$ where
$a \notin P$. But $aB \subset AB \subset P$ (by the definition of the product ideal), so for all $b \in B$, $ab \in P$. Thus by the second condition,
$b \in P$, and that's true for all $b \in B$, so $B \subset P$.


Lemma 141 


The prime ideals of $R$ (the ring of integers of $\mathbb{Q}[\delta]$) are the zero ideal (since $R$ is a domain) and the maximal ideals.


To show this, we'll need a subresult: 
Lemma 142


For any nonzero ideal $A$ of $R$, $|R/A|$ is finite.


Proof. $\bar A A = nR$, and $R/(nR)$ has $n^2$ elements. Since $nR \subset A$, $|R/A| \le |R/nR| = n^2$.


So to prove Lemma 141, note that if P is a nonzero prime ideal, then R/P is a finite domain by our sublemma, 


and therefore it is a field. And this means that P is indeed maximal by Corollary 84. 


17 March 18, 2019


Let's review a bit about lattices. A lattice is a subgroup of $\mathbb{R}^2$ with a lattice basis $\alpha_1, \alpha_2$: we can write it in the form
$$A = \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2.$$


Definition 143 


If $B \subset A$ is a sublattice, then the index $[A : B]$ is the number of additive cosets of the form $a + B$ in $A$.


We basically ask how many translated copies of $B$ are needed to cover $A$ exactly once. An alternative way to think
about this is to draw a minimal parallelogram of $B$, spanned by a lattice basis $\beta_1, \beta_2$: we only want to count one of the corners, half of the points on
the boundaries (since the top/bottom and left/right edges differ by translations in $B$), and all the points in the middle. In other
words, we want to count the points $a_0 \in A$ satisfying
$$a_0 = r_1\beta_1 + r_2\beta_2, \quad 0 \le r_1, r_2 < 1.$$
Alternatively, if we have a lattice basis $\alpha_1, \alpha_2$ for $A$, we can write $\beta_1$ and $\beta_2$ as integer combinations of $\alpha_1, \alpha_2$:
$$\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = M \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}$$
for some $2 \times 2$ integer matrix $M$.


Then $[A : B] = |\det M|$; another way to say this is that


$$[A : B] = \frac{\Delta(B)}{\Delta(A)},$$


where $\Delta(A)$ is the area of the parallelogram spanned by a lattice basis of $A$.
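
As a sanity check on the two descriptions of the index, the following Python sketch (function names are ours) takes $A = \mathbb{Z}^2$ with its standard basis, reads the rows of an integer matrix as a basis $\beta_1, \beta_2$ of $B$, and compares $|\det M|$ with a brute-force count of the points $r_1\beta_1 + r_2\beta_2$, $0 \le r_1, r_2 < 1$.

```python
from fractions import Fraction

def lattice_index(m):
    """Index [Z^2 : B] as |det M|, where the rows of m are a basis of B."""
    (r, s), (t, u) = m
    return abs(r * u - s * t)

def count_coset_representatives(m):
    """Count integer points r1*b1 + r2*b2 with 0 <= r1, r2 < 1 (b1, b2 = rows of m)."""
    (r, s), (t, u) = m
    det = r * u - s * t
    if det == 0:
        raise ValueError("rows must be linearly independent")
    count = 0
    # The half-open parallelogram lies inside this bounding box.
    for x in range(-abs(r) - abs(t), abs(r) + abs(t) + 1):
        for y in range(-abs(s) - abs(u), abs(s) + abs(u) + 1):
            # Solve (x, y) = r1*(r, s) + r2*(t, u) by Cramer's rule.
            r1 = Fraction(x * u - y * t, det)
            r2 = Fraction(r * y - s * x, det)
            if 0 <= r1 < 1 and 0 <= r2 < 1:
                count += 1
    return count

for m in ([[2, 0], [0, 4]], [[10, 9], [11, 10]], [[1, 1], [1, -1]]):
    print(m, lattice_index(m), count_coset_representatives(m))
```

The two numbers agree for each matrix, which is the statement $[A : B] = |\det M|$ in these cases.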


Corollary 144 
For lattices $A \supset B \supset C$, we have


$$[A : C] = [A : B][B : C].$$


(We just multiply out the areas.) And this statement is true for indices in any groups, even if the orders are infinite!
Question 145. What are the shortest vectors in a lattice $A$?


Sometimes, like with the standard basis of $\mathbb{Z}^2$, the shortest vectors are just our lattice basis. However, suppose we have something like
$v_1 = 10e_1 + 9e_2$, $v_2 = 11e_1 + 10e_2$ (which is also a lattice basis of $\mathbb{Z}^2$, since $\begin{pmatrix} 10 & 11 \\ 9 & 10 \end{pmatrix}$ has determinant 1). Then clearly the shortest
vectors aren't going to come from the lattice basis.

For sake of simplicity, let's say the shortest nonzero vector $\alpha$ (that is, the one with minimal length) is horizontal. Then we
have elements $\dots, -\alpha, 0, \alpha, 2\alpha, \dots$ in our lattice.

We can't have another lattice point strictly inside a circle of radius $|\alpha|$ around any of these points. So we have another
point $\beta$ in our lattice outside those circles: let's say $\beta$ forms a lattice basis with $\alpha$. Then the area of the parallelogram
formed by $\alpha, \beta$ is $\Delta(A) = |\alpha| h$, where $h$ is the height of $\beta$ above the axis, and that height $h$ is at least $\frac{\sqrt{3}}{2}|\alpha|$ (the height of the
intersection of two adjacent circles). Thus,


$$\Delta(A) \ge \frac{\sqrt{3}}{2}|\alpha|^2.$$


Corollary 146 


For any lattice $A$, we can bound the length of the shortest nonzero vector $\alpha$:


$$|\alpha|^2 \le \frac{2}{\sqrt{3}}\Delta(A).$$
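
The bound can be checked numerically on the lattice spanned by $v_1 = (10, 9)$ and $v_2 = (11, 10)$. The sketch below is a brute-force search, not an efficient algorithm: it only tries coefficients in a fixed box (the parameter `box` is our assumption and must be large enough for the lattice at hand), and the function names are ours.

```python
from math import sqrt

def shortest_vector(v1, v2, box=20):
    """Brute-force a shortest nonzero vector x*v1 + y*v2 with |x|, |y| <= box."""
    best = None
    for x in range(-box, box + 1):
        for y in range(-box, box + 1):
            if (x, y) == (0, 0):
                continue
            w = (x * v1[0] + y * v2[0], x * v1[1] + y * v2[1])
            length_sq = w[0] ** 2 + w[1] ** 2
            if best is None or length_sq < best[0]:
                best = (length_sq, w)
    return best

def area(v1, v2):
    """Area of the parallelogram spanned by v1 and v2."""
    return abs(v1[0] * v2[1] - v1[1] * v2[0])

v1, v2 = (10, 9), (11, 10)
length_sq, w = shortest_vector(v1, v2)
bound = 2 / sqrt(3) * area(v1, v2)
print(length_sq, w, bound)
```

Both basis vectors have squared length well over 100, but the search finds a vector of squared length 1, which is below the bound $\frac{2}{\sqrt{3}}\Delta(A) \approx 1.155$.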


With this, let's go back to quadratic number fields: say we're working with a negative squarefree integer $d$ not
congruent to 1 mod 4, so $d = -1, -2, -5, -6, \dots$. Then the ring of integers $R$ of $\mathbb{Q}[\delta]$, where $\delta^2 = d$, can be
written in the form

$$\{a + b\delta : a, b \in \mathbb{Z}\}.$$


Then $(1, \delta)$ forms a lattice basis. Let's look at nonzero ideals in this ring $R$. Ideals are sublattices, and for any
nonzero ideal $A$,

$$[R : A] = \frac{\Delta(A)}{\Delta(R)} = \frac{\Delta(A)}{\sqrt{|d|}}.$$

Recall Proposition 136, the main lemma: for any nonzero ideal $A$, $\bar A A = nR$, where $n$ is a positive integer. This gives us the
cancellation law: $AB = AC \implies B = C$, and for any $A \supset B$, there exists a $C$ such that $B = AC$.


Earlier, we also discussed the idea of a prime ideal $P$ (a proper ideal of $R$). The following definitions are equivalent:
• $R/P$ is a domain.

• If $ab \in P$, then $a \in P$ or $b \in P$.

• If $A, B$ are ideals, then $AB \subset P \implies A \subset P$ or $B \subset P$.


The second and third points being equivalent comes essentially from comparing “subset” to “divides.” Luckily, in
the rings $R$ we're talking about, the nonzero prime ideals are maximal ideals. (Maximal ideals are always prime because $R/M$
is a field, which is a domain.) In particular, we showed last time that $R/P$ is a finite field with order $[R : P]$.


We can use this to think about the idea of factoring into prime ideals: 


Proposition 147 


If $A$ is a nonzero proper ideal of $R$ (the ring of integers of $\mathbb{Q}[\delta]$), then we can write $A = P_1 \cdots P_k$, where the $P_i$ are prime ideals, unique up to ordering.


Proof. The proof is the same as for the integers. If we want to factor $A$, choose a maximal (hence prime) ideal $P$ that contains
$A$: every proper ideal is contained in at least one. Then $P$ divides $A$, so $A = PB$.

We claim that $B \supsetneq A$. It's clear that $B \supset A$, because $A = PB \subset RB = B$. On the other hand, if we had $A = B$, we'd
have $RA = A = PA$. But now by the cancellation law, we have $R = P$, which is a contradiction. Thus,


$$A \subsetneq B \implies [R : A] > [R : B],$$
and now we can just induct on the index! (The base case is when $A$ is a maximal ideal, in which case $R/A$ is a
field and $A$ is already prime.)


And showing uniqueness is the same as in factoring of integers. If $A = P_1 \cdots P_k = P_1' \cdots P_\ell'$, then $P_1 \supset P_1' \cdots P_\ell'$, so by the third property, $P_1$ must contain one of the $P_j'$ and vice versa. Since nonzero prime ideals are maximal, $P_1$ equals that $P_j'$, and we can cancel it from both sides and repeat.


Our next result was initially proved just for Gaussian integers: 


Proposition 148 
Let $P$ be a nonzero prime ideal of $R$. Then $\bar P P$ is either $pR$ or $p^2R$ for a prime integer $p$. On the other hand, if $p$ is a prime integer, then $pR$ is a prime ideal or $\bar P P$ for some prime ideal $P$.


The proof should look fairly similar as well.


Proof. If $\bar P P = nR$, we can factor $n = p_1 \cdots p_k$ in the integers. Then $nR = (p_1R) \cdots (p_kR)$. This means
$$\bar P P = (p_1R) \cdots (p_kR),$$


and now we can factor the right side into prime ideals: since both sides must be the same by unique factorization, we have $k = 1$ or $k = 2$. The first case gives $\bar P P = pR$, and the second case has $P = p_1R$, which implies $\bar P = p_1R$ as well, giving $\bar P P = p^2R$ with $p = p_1$.


The second statement is similar and not too hard.


So the question now: which primes split, and which remain prime, if we're working in a ring $R = \mathbb{Z}[\delta]$? Well, $p$ remains
prime if and only if $pR$ is a prime ideal, which implies that $pR$ is maximal, and therefore $R/(pR)$ should be a field. To
understand this, take the integer polynomials $\mathbb{Z}[x]$. We can map $\mathbb{Z}[x] \to \mathbb{Z}[\delta]$ with kernel $(x^2 - d)$, and then
map $\mathbb{Z}[\delta]$ to $R/pR$ with kernel $(p)$. On the other hand, we can map $\mathbb{Z}[x]$ to $\mathbb{F}_p[x]$ first and then map it to $R/pR$ with kernel $(x^2 - d)$, so $R/pR \cong \mathbb{F}_p[x]/(x^2 - d)$. That tells us the answer:


Proposition 149


A prime $p$ remains prime if and only if $R/(pR)$ is a field, which happens if and only if $x^2 - d$ is irreducible in $\mathbb{F}_p[x]$.


Example 150 
Let $p = 5$, $d = -21$. Does 5 remain prime or split in $\mathbb{Q}[\sqrt{-21}]$?


Consider the polynomial $x^2 + 21$ mod 5: this is $x^2 + 1 \equiv (x + 2)(x - 2)$, so 5 splits in $\mathbb{Z}[\delta]$. Thus we have
$$5R = \bar P P;$$


what do the ideals $P, \bar P$ look like? Notice that $x - 2$ is sent to $\delta - 2$ when we take $\mathbb{Z}[x] \to \mathbb{Z}[\delta]$, but we want $x - 2$
to be in the kernel of the map $\mathbb{Z}[x] \to R/P$. So $\delta - 2$ should be in the kernel of $R \to R/P$, that is, in $P$, and
therefore $P = (5, \delta - 2)$. Let's check this: we have $\bar P = (5, -\delta - 2) = (5, \delta + 2)$, and


$$\bar P P = (25,\ 5\delta + 10,\ 5\delta - 10,\ \delta^2 - 4 = -25).$$


Every generator is a multiple of 5, and $(5\delta + 10) - (5\delta - 10) - 25 = -5$ is a linear combination of them, so this ideal is indeed $5R$, as desired.
Recall that if we have a negative squarefree integer $d \equiv 2, 3 \bmod 4$, we can look at the ring of algebraic integers of
$\mathbb{Q}[\delta]$, which takes the nice form $\mathbb{Z}[\delta] = \{a + b\delta \mid a, b \in \mathbb{Z}\}$, where $\delta^2 = d$. In such a ring, we know that we can factor any
nonzero proper ideal $A$ into prime ideals uniquely up to order.

We also know that for any integer prime $p$, we can write the ideal $pR$ either as $P$ (in which case $p$ remains prime)
or as $\bar P P$ (in which case $p$ splits). We looked at $d = -21$ as an example last time by using the following property:


Proposition 151 


A prime $p$ splits if and only if there exists an integer $a$ such that $a^2 \equiv d \bmod p$. Then the ideal $P = (p, a - \delta)$ yields the factorization $\bar P P = pR$.


Proof. We know that $P \supsetneq pR$, since $a - \delta$ is not a multiple of $p$. Computing $P\bar P$,
$$P\bar P = (p, a - \delta)(p, a + \delta) = (p^2,\ p(a + \delta),\ p(a - \delta),\ a^2 - d).$$


Now $p$ divides all terms as long as $a^2 \equiv d \bmod p$. So $pR \supset \bar P P$, and therefore $pR$ divides $\bar P P$. This means $\bar P P$ is
$pR$ times some other ideal, and because we know $pR$ must split into $\bar P P$ or remain prime by Proposition 148, this is
actually an equality. (And this tells us that $P$ is a prime ideal.)


Example 152 
Let's consider the case $p = 2$, $d = -21$. We know that $-21 \equiv 1 \bmod 2$, which is a square, so 2 splits. Similarly, 3 splits because
$-21 \equiv 0 \bmod 3$. Finally, 5 splits as well: $-21 \equiv 4 \bmod 5$.


Specifically, using the notation for principal ideals $pR = (p)$, we know that
$$P = (2, 1 - \delta) \implies (2) = \bar P P,$$
$$Q = (3, -\delta) \implies (3) = \bar Q Q,$$
$$S = (5, 2 - \delta) \implies (5) = \bar S S.$$


There is a special case: 


Fact 153 
It's possible that $p$ splits into $(p) = \bar P P$, and $\bar P = P$. For example, with $P$ above, $(2, 1 - \delta) = (2, 1 + \delta)$, and
similarly with $Q$, $(3, -\delta) = (3, \delta)$. In these cases, $p$ ramifies (so 2 and 3 ramify but 5 doesn't).


Then we have $\bar P P = P^2 = (p)$, where $P = (p, a - \delta)$. Expanding out the product,
$$P^2 = (p^2,\ p(a - \delta),\ (a - \delta)^2).$$


Since $(a - \delta)^2 = a^2 - 2a\delta + d$, and this is supposed to be an element of $(p)$, we must have $p$ divide $2a$ and $a^2 + d$ if
the prime $p$ ramifies. By construction, $p \mid a^2 - d$, so this is equivalent to saying that $p$ divides $2a$ and $p$ divides $2d$.


Corollary 154 


A prime $p$ ramifies in $\mathbb{Z}[\delta]$ (where $d \equiv 2, 3 \bmod 4$) if $p = 2$ or $p \mid d$.

In both of these cases, $p$ does divide $2a$: just use $(2, 1 - \delta)$ for $p = 2$ and $(p, -\delta)$ in the other.
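
The splitting criteria above are easy to turn into a small classifier. The sketch below (the function name `classify_prime` and the returned labels are ours) assumes $d \equiv 2, 3 \bmod 4$ as in Corollary 154, declares $p$ ramified when $p = 2$ or $p \mid d$, and otherwise searches for a square root $a$ of $d$ mod $p$.

```python
def classify_prime(p, d):
    """Classify a prime p in Z[delta], delta^2 = d, assuming d = 2 or 3 mod 4."""
    if p == 2 or d % p == 0:
        return "ramifies"
    # p splits exactly when some a satisfies a^2 = d (mod p).
    if any((a * a - d) % p == 0 for a in range(p)):
        return "splits"
    return "remains prime"

for p in (2, 3, 5, 11, 13):
    print(p, classify_prime(p, -21))
```

For $d = -21$ this reports that 2 and 3 ramify and that 5 splits (via $a = 2$), in line with Example 152 and Fact 153.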


Definition 155 


The norm of an element $\alpha \in R$ is


$$N(\alpha) = \bar\alpha\alpha = |\alpha|^2,$$


and this is always an integer.


Norm is multiplicative: $N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha, \beta$. The main lemma says that if $A$ is a nonzero ideal of $R$, then $\bar A A = nR$
for a positive integer $n$. So we can define the norm of $A$ to be $n$: notice that this preserves the multiplicative property
$N(AB) = N(A)N(B)$ for ideals.


Theorem 156 


Let $A, B, C$ be nonzero ideals of $R$ with $B \supset C$. Then $[B : C] = [AB : AC]$, and $N(A) = [R : A]$.


This is obvious for a principal ideal generated by an integer: $[A : pA] = p^2$ by drawing a $p$ by $p$ grid. But it's a bit harder to prove this in
general:


Proof. It's enough to prove the first statement for $A = P$, a prime ideal, since we can always write out $A = P_1P_2 \cdots$
and then successively apply the result for prime ideals one at a time by induction.
We know that $B$ contains $C$, $PB$ contains $PC$, $B$ contains $PB$, and $C$ contains $PC$. Since $[A : C] = [A : B][B : C]$ for nested lattices,


$$[B : PC] = [B : C][C : PC] = [B : PB][PB : PC],$$


and now it's enough to show that $[B : PB] = [C : PC]$. We'll do this by computing directly for each of the possible
forms of the prime ideal $P$.
If $P = pR$, then $[B : PB] = p^2$, and $[C : PC] = p^2$ as well. Otherwise, $pR = \bar P P$, and then


$$B \supset PB \supset \bar P PB = pB,$$


and this means (by the product rule again) that $p^2 = [B : PB][PB : \bar P PB]$. Those must both be $p$ (unless one of them
is 1, which is not the case by the cancellation law). So then $[B : PB] = p$, and the same argument works for $[C : PC]$ as
well. Thus we've shown in all cases that $[B : PB] = [C : PC]$, and thus $[B : C] = [PB : PC]$ as desired.

For the second statement, let's factor $A = PQ$, where $P$ is a prime ideal and $Q$ is some (not necessarily prime)
ideal. We know that $N(A) = N(P)N(Q)$, and $R \supset P \supset PQ = A$. Looking at the index of $A$ in $R$, we know that
$[R : P] = [R : PR]$ is $N(P)$ by the arguments above, and then


$$[R : A] = [R : P][P : PQ] = N(P)N(Q),$$


since $[P : PQ] = [R : Q]$ (by the first part of this theorem), and by induction
we know that $[R : Q] = N(Q)$. Finally, since norm is multiplicative, this is just $N(PQ) = N(A)$, as desired.


Recall Corollary 146, which says that if $A$ is an ideal and $\alpha$ is a shortest nonzero vector in $A$, then $|\alpha|^2 \le \frac{2}{\sqrt{3}}\Delta(A)$.
From the above theorem, we have that $[R : A] = \frac{\Delta(A)}{\Delta(R)} = N(A)$; we also know that $\Delta(R) = \sqrt{|d|}$ in the lattices we're dealing
with. So if we plug in all of the results that we have,
$$|\alpha|^2 \le \frac{2}{\sqrt{3}}\Delta(A) = \frac{2}{\sqrt{3}}\sqrt{|d|}\,N(A) = 2\sqrt{\frac{|d|}{3}}\,N(A).$$
This motivates the definition of the constant $\mu = 2\sqrt{\frac{|d|}{3}}$.


Fact 157 


The value of $\mu$ is different for $d \equiv 1 \bmod 4$, but we're not dealing with that case right now.


Let's define an equivalence relation on nonzero ideals: $A \sim A'$ if $A' = \lambda A$ for some complex $\lambda \in \mathbb{C}$. Basically, this means
the two lattices are similar geometric figures, except that we can't change orientation of the lattice. Note that all
nonzero principal ideals are equivalent: $R \sim A$ if $A = \lambda R$.


Definition 158
An ideal class is an equivalence class of ideals (where $A \sim A'$ if $A' = \lambda A$ for some $\lambda \in \mathbb{C}$). Let the class of $A$ be
denoted $\langle A \rangle$.


Theorem 159 
Every ideal class contains an ideal $I$ with norm $N(I) \le \mu = 2\sqrt{\frac{|d|}{3}}$.


In other words, we don't have to look at very small ideals with large lattices: we can just look within some bound.


Proof. Start with any ideal $A$. Choose an $\alpha \in A$ with $N(\alpha) \le \mu N(A)$: this exists by the calculations we were doing above.
Note that $A$ contains the principal ideal $(\alpha)$, so $A$ divides $(\alpha)$ and we can write $(\alpha) = AB$ for some $B$. Taking norms
of both sides,

$$N(\alpha) = N(A)N(B) \le \mu N(A),$$


so we have $N(B) \le \mu$. We're not quite done here, but we'll show that $\bar B$ is in the class of $A$ and has the same norm
as $B$ next time.


Example 160 


For $d = -21$, $\mu = 2\sqrt{7} < 6$. So we only have to look at ideals with norm less than 6 to find all the ideal classes.
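
As a quick numerical companion, the sketch below computes $\mu = 2\sqrt{|d|/3}$ and lists the integer primes $p \le \mu$, which are the only primes whose factors need to be examined. The function names `mu_bound` and `primes_up_to` are ours, and the floating-point comparison is adequate only for small $|d|$ like the examples here.

```python
from math import sqrt

def mu_bound(d):
    """The constant mu = 2*sqrt(|d|/3) for d = 2, 3 mod 4."""
    return 2 * sqrt(abs(d) / 3)

def primes_up_to(x):
    """All primes p <= x, by trial division (fine for tiny bounds)."""
    return [p for p in range(2, int(x) + 1)
            if all(p % q for q in range(2, int(sqrt(p)) + 1))]

for d in (-1, -5, -21, -26):
    print(d, round(mu_bound(d), 3), primes_up_to(mu_bound(d)))
```

For $d = -21$ and $d = -26$ the bound lies between 5 and 6, so only the primes 2, 3, and 5 matter.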


19 March 22, 2019 


We're going to say some more about ideal classes today. As before, we're still working in the ring $\{a + b\delta \mid a, b \in \mathbb{Z}\}$
for some $\delta^2 = d$, where $d$ is a negative squarefree integer congruent to 2 or 3 mod 4. Recall that an ideal class is an
equivalence class of ideals: $A \sim A'$ if $A' = \lambda A$ for some $\lambda \in \mathbb{C}$. Our goal is to make these ideal classes form a group.
Denote the class of $A$ as $\langle A \rangle$. We can define a law of composition via $\langle A \rangle\langle B \rangle = \langle AB \rangle$; this is commutative
and associative because multiplication in the ring is commutative and associative. This law of composition is indeed
well defined: if $A \sim A'$, $B \sim B'$, then $AB \sim A'B'$, since we can write $A' = \lambda A$, $B' = \lambda' B \implies A'B' = \lambda\lambda' AB$.
Note that the identity element here is the class of the whole ring: denote this as $\langle R \rangle = 1$. Then an ideal $A \in \langle R \rangle$ if and only
if $A = \lambda R$: this means $A$ is a principal ideal. Furthermore, $\langle A \rangle^{-1} = \langle \bar A \rangle$, because $\bar A A = nR$ for some integer $n \in \mathbb{Z}$,
and therefore $\langle \bar A \rangle\langle A \rangle = \langle (n) \rangle = 1$. This means the ideal classes form an abelian group called the ideal class group!
We'll denote this as $C$.

Example 161


C is the trivial group if every ideal is principal: in these cases, R has unique factorization, since all PIDs (principal 


ideal domains) are UFDs (unique factorization domains). 


We call $|C|$ the class number: it measures the “failure” of unique factorization.
It's surprisingly easy to compute the class number! Remember that we can define the constant $\mu = 2\sqrt{\frac{|d|}{3}}$, and
we can apply Theorem 159, which says that every ideal class $\langle A \rangle$ contains an ideal $A'$ with norm at most $\mu$.

As a quick review, the norm of an ideal is defined to be $N(A) = n$ if $\bar A A = nR$. This is also the index $[R : A]$,
which we can measure by taking a lattice basis and computing $\frac{\Delta(A)}{\Delta(R)}$. In our ring $R = \{a + b\delta\}$, a lattice basis is $(1, \delta)$,
so $\Delta(R) = \sqrt{|d|}$ (the area of the rectangle formed by 1 and $\delta$). And this norm is useful because it is multiplicative and
always an integer!


Continuation of proof of Theorem 159. As before, if we're given an ideal $A$, let $\alpha$ be a shortest nonzero vector in
$A$. We have


$$N(\alpha) = |\alpha|^2 \le \frac{2}{\sqrt{3}}\Delta(A),$$


and we know $\alpha \in A$, so the principal ideal generated by $\alpha$ is a subset of $A$. Inclusion of ideals gives us divisibility:
therefore, $A$ divides $(\alpha)$, and we can write $(\alpha) = AB$ for some ideal $B$. Now notice that (working in the ideal class
group) $1 = \langle (\alpha) \rangle = \langle A \rangle\langle B \rangle$, and $N(\alpha) = N(A)N(B)$ as well: this second fact means


$$N(A)N(B) = N(\alpha) \le \frac{2}{\sqrt{3}}\Delta(A) = \frac{2}{\sqrt{3}}\Delta(R) \cdot N(A),$$
and therefore we have $N(B) \le \frac{2}{\sqrt{3}}\Delta(R) = \frac{2}{\sqrt{3}}\sqrt{|d|} = \mu$. But now note that $\langle B \rangle = \langle A \rangle^{-1}$, so $\langle \bar B \rangle = \langle A \rangle$, so $\bar B$ is in the
same ideal class as $A$. And since $N(\bar B) = N(B)$, we're done: we've proven that $A$ is in the same ideal class as an ideal
$\bar B$ with norm at most $\mu$.


So now we can just look at all ideals with norm at most $\mu$: there's only a finite number of them! We could
brute-force our way through them, but here's a better way: we can write $A = P_1 \cdots P_k$ as a product of prime ideals
$P_i$. Norm is multiplicative, so

$$N(A) = N(P_1) \cdots N(P_k),$$


where $N(P_i) \le \mu$ for all $i$. That means $C$ is generated by the classes $\langle P \rangle$ of prime ideals with $N(P) \le \mu$ as well!


Why is it easier to work with prime ideals? We know that
$$N(P) = p \text{ or } p^2,$$


where $p$ is a prime integer, corresponding to $p$ splitting or remaining prime. If $p$ remains prime, then the ideal $P$ is just
generated by $p$, which is a principal ideal: those are in the same class as $R$, so we can ignore those! This has led us
to the following result:


Proposition 162 


The ideal class group of $\mathbb{Z}[\delta]$ is generated by the classes of prime ideals $P$ where $\bar P P = pR$, $p \le \mu$, and $p$ splits.


Example 163 
Let's go back to $d = -21$: what's the ideal class group?

We have $\mu = 2\sqrt{7} < 6$, so the class group is generated by the prime ideals that are factors of $(2), (3), (5)$. These
principal ideals all split: 2 always ramifies for $d \not\equiv 1 \bmod 4$, so $(2) = P^2$. 3 ramifies as $Q^2$ as well (recall that this is
because 3 divides $d$). Finally, 5 splits as $\bar S S$.

So what are our relations for the ideal class group? We have $\langle P \rangle^2 = 1$ and $\langle Q \rangle^2 = 1$. $S$ is a bit harder, but we
have a secret method: compute some norms of elements! Note that
$$N(1 + \delta) = (1 + \delta)(1 - \delta) = 22 = 2 \cdot 11,$$


and this is an equation among elements, but we can take the principal ideals generated by the left and right side
instead. So we can rewrite the above equation as
$$(1 + \delta)(1 - \delta) = (22) = (2)(11),$$


where this equation now refers to ideals. To factor this equation, note that $(1 + \delta) = P_1 \cdots P_k \implies (1 - \delta) = \bar P_1 \cdots \bar P_k$,
and those must be the same prime ideals that divide the right side. $(2)$ factors as $P^2$ from above, and since we have
an even number of prime ideals on the left side, we must write the right side as $P^2 \cdot \bar T T$ for some prime ideal $T$ (in
other words, 11 splits, and $x^2 + 21$ is reducible mod 11).

But then that means $k = 2$, and now we can't have both copies of $P$ be part of the factorization of the same ideal on the
left side: after all, the ideals generated by 2 and by $1 + \delta$ are different. So this means that


$$(1 + \delta) = PT, \quad (1 - \delta) = \bar P \bar T = P\bar T.$$


That wasn't particularly helpful, so let's try looking at the norm of $2 + \delta$: it is $4 - \delta^2 = 25$, so looking at the ideals,
we have
$$(2 + \delta)(2 - \delta) = (5)(5) = \bar S S \bar S S,$$


so now we must have either $(2 + \delta) = \bar S S$, $S^2$, or $\bar S^2$; it's not the first one because the ideal generated by 5 is not the
ideal generated by $2 + \delta$. Thus $(2 + \delta) = S^2$ (or $\bar S^2$; without loss of generality, say $S^2$), and this means $S^2$ is a principal ideal!
Therefore $\langle S \rangle^2 = 1$ in our class group.


So now we know that $\langle P \rangle^2 = \langle Q \rangle^2 = \langle S \rangle^2 = 1$, so our ideal class group has order at most 8. But there might be
other relations as well!
To figure those out, let's compute $N(3 + \delta)$: it is $(3 + \delta)(3 - \delta) = 30 = 2 \cdot 3 \cdot 5$, so


$$(3 + \delta)(3 - \delta) = (2)(3)(5) = P^2Q^2\bar S S.$$


The two factors on the left side are conjugates of each other, so we must have $(3 + \delta) = PQS$ or
$PQ\bar S$. But $\langle S \rangle^2 = 1$, so the classes $\langle S \rangle = \langle \bar S \rangle$, and this gives us the relation
$$\langle P \rangle\langle Q \rangle\langle S \rangle = 1.$$


This is another relation, and we can use it to eliminate $S$ from our list of generators! We know that $P$ and $Q$
are not principal ideals (in particular, $P = (2, 1 - \delta)$, $Q = (3, \delta)$, and no element of $R$ has norm 2 or 3), so both ideal classes are not trivial. So our group
contains $1, \langle P \rangle, \langle Q \rangle, \langle PQ \rangle$, and the only remaining possible relation is
$$\langle P \rangle\langle Q \rangle = 1,$$


which is not true, since the ideal has norm $N(PQ) = N(P)N(Q) = 2 \cdot 3 = 6$. Then $PQ = (\alpha)$ would have to be
generated by some element with norm 6, and none exist! Thus we've found our class group: $C = C_2 \times C_2$. The idea
is that we can always try this kind of drawn-out calculation, and it'll work.
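
The claims of the form "no element has norm 6" reduce to solving $a^2 + |d| b^2 = m$ in integers, which a short search can confirm. Below is a minimal Python sketch (the function name `elements_of_norm` is ours) that lists all $a + b\delta$ of a given norm for negative $d$.

```python
from math import isqrt

def elements_of_norm(m, d):
    """All integer pairs (a, b) with a^2 - d*b^2 == m, for d < 0 and m >= 0."""
    sols = set()
    b_max = isqrt(m // (-d))          # need |d| * b^2 <= m
    for b in range(-b_max, b_max + 1):
        rest = m + d * b * b          # a^2 must equal m - |d| * b^2
        a = isqrt(rest)
        if a * a == rest:
            sols.add((a, b))
            sols.add((-a, b))
    return sorted(sols)

d = -21
for m in (2, 3, 6, 22, 25, 30):
    print(m, elements_of_norm(m, d))
```

For $d = -21$ the lists for norms 2, 3, and 6 are empty, so $P$, $Q$, and $PQ$ cannot be principal, while norms 22, 25, and 30 are attained by $\pm 1 \pm \delta$, $\pm 2 \pm \delta$ (and $\pm 5$), and $\pm 3 \pm \delta$ respectively.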


Example 164 


Now let's look at $d = -26$: we find that $\mu < 6$ again, so we again only need to look at 2, 3, 5.


As always, 2 ramifies as $(2) = P^2$. For 3, note that $-26$ is 1 mod 3, which is a square, and therefore 3 splits as $\bar Q Q$.
Finally, $-26$ is 4 mod 5, which is a square, so 5 splits as $\bar S S$ as well. Once again, we have $\langle P \rangle^2 = 1$, and we want to use
norms to find the rest.

Some potential candidates here are $N(1 + \delta) = 27 = 3^3$, $N(2 + \delta) = 30 = 2 \cdot 3 \cdot 5$, and we could also try
$N(3 + \delta) = 35$ and $N(4 + \delta) = 42$ (which are less useful). Looking at ideals,


$$(1 + \delta)(1 - \delta) = (3)^3 = \bar Q^3 Q^3,$$


and

$$(2 + \delta)(2 - \delta) = (2)(3)(5) = P^2 \bar Q Q \bar S S.$$
So $(1 + \delta)$ uses three of the $Q$s and $\bar Q$s, which means it's one of $Q^3$, $Q^2\bar Q = 3Q$, $Q\bar Q^2 = 3\bar Q$, or $\bar Q^3$. It can't be $3Q$
or $3\bar Q$, because $1 + \delta$ is not a multiple of 3, and then it doesn't matter which of $Q$ and $\bar Q$ we choose: either way, we
find that $\langle Q \rangle^3 = 1$, since $(1 + \delta)$ is principal.


The next step is to show similarly that


$$(2 + \delta) = PQ^{\pm 1}S^{\pm 1}$$


(where $Q^{-1}$ and $S^{-1}$ stand for $\bar Q$ and $\bar S$, which represent the inverse classes), and that means $\langle P \rangle\langle Q \rangle^{\pm 1}\langle S \rangle^{\pm 1} = 1$, meaning we can always solve to eliminate $S$. So the ideal class group of $\mathbb{Z}[\sqrt{-26}]$
is generated by $\langle P \rangle$ with $\langle P \rangle^2 = 1$ and $\langle Q \rangle$ with $\langle Q \rangle^3 = 1$, and thus $C = C_6$ is the cyclic group of order 6.
Note that it is true in general that $\langle S \rangle^2 = 1 \implies \langle S \rangle = \langle \bar S \rangle$, but this does not mean $S$ and $\bar S$ are the same. We
only know that $\bar S = \lambda S$: what is $\lambda$ here? Let's take $d = -21$ as an example: we can use the fact that
$$(\bar S S)S = 5S,$$


but also
$$\bar S(SS) = \bar S(2 + \delta).$$
So $\bar S = \frac{5}{2 + \delta}S$. Notice that the absolute value of the constant $\lambda = \frac{5}{2 + \delta}$ is 1 here, which we could have also found by
taking norms.


Let's talk about abelian groups today: we'll be using additive notation. For example, we can look at a lattice in $\mathbb{Z}^n$ or
the cyclic group $\mathbb{Z}/n\mathbb{Z}$ for a positive integer $n$. We'll carry over the description of a basis from that for a vector space.


So let $V$ be an abelian group. Given some elements $\mathbf{V} = (v_1, \dots, v_n)$ of $V$, we have a map
$$\mathbf{V} : \mathbb{Z}^n \to V,$$


which sends an $n$-tuple of integers $x$ to an element of our group via $\mathbf{V}x = \sum_i v_i x_i$. In other words, we
have a homomorphism compatible with the group operations. Then $\mathbf{V}$ is independent if this map is injective, and $\mathbf{V}$
generates (or spans) $V$ if it is surjective. And as usual, $\mathbf{V}$ is a basis of $V$ if the map is bijective.


Abelian groups may not have a basis: for example, we can write all elements of $\mathbb{Z}/n\mathbb{Z}$ in many different ways (since $nv = 0$ for every $v$). But
if $V$ has a basis, then it is isomorphic to $\mathbb{Z}^n$: then $V$ is a free abelian group. (We'll get back to the case where we
don't have a basis later.)


So now, let $V$ be a free abelian group with basis $\mathbf{V} = (v_1, \dots, v_n)$. Let's try to develop some theory:


• We can do a base change: if we have a new basis $\mathbf{V}' = (v_1', \dots, v_n')$, we can write


$$v_j' = \sum_{i=1}^{n} v_i p_{ij}$$


for some integers $p_{ij}$. So $\mathbf{V}' = \mathbf{V}P$ for some matrix $P$, and this means that for any vector $v \in V$, we can write it
in two different ways with coordinate vectors:


$$v = \mathbf{V}x = \mathbf{V}'x' \implies Px' = x.$$


Since the $p_{ij}$ are integers, $P$ can be any invertible matrix with entries in $\mathbb{Z}$ whose inverse also has entries in $\mathbb{Z}$: $P \in GL_n(\mathbb{Z})$. But from the first
18.701 assignment, this can only happen if the determinant is $\pm 1$, and this basically tells us everything we need
about base change matrices for now.


• Consider some homomorphism $\varphi : V \to W$ of free abelian groups. This is just like a matrix of a linear
transformation: if $\mathbf{V} = (v_1, \dots, v_n)$ is a basis of $V$, and $\mathbf{W} = (w_1, \dots, w_m)$ is a basis of $W$, then each basis
vector satisfies
$$\varphi(v_j) = \sum_i w_i a_{ij}$$
for some integers $a_{ij}$. So now


$$(\varphi(v_1), \dots, \varphi(v_n)) = (w_1, \dots, w_m)A$$


for some $m \times n$ matrix $A$ with integer entries. If we change bases for our groups, so that $\mathbf{V}' = \mathbf{V}P$ and $\mathbf{W}' = \mathbf{W}Q$,
then $P$ is an $n \times n$ matrix and $Q$ is an $m \times m$ matrix: our new homomorphism matrix is now
$$A' = Q^{-1}AP.$$


Indeed, this is the same change-of-basis formula that we're used to.


Time to move on to something slightly interesting: how simple can we make $A'$? Remember that in a vector
space over a field $F$, for any $m \times n$ matrix $A$ with entries in $F$, there exist invertible matrices $P, Q$ over $F$ such that $A' = Q^{-1}AP$
has the block form


$$A' = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}.$$


In other words, $A'$ takes some basis elements to other ones and throws away the rest. But we're working in
$\mathbb{Z}$, which is not a field. It turns out we can still get something analogous!


Theorem 165 
Let $A$ be an $m \times n$ matrix with entries in $\mathbb{Z}$. Then there exist invertible integer matrices $P$ and $Q$ such that $Q^{-1}AP$ is
diagonal:


$$A' = Q^{-1}AP = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix},$$


where $D$ is diagonal with positive integer entries satisfying $d_1 \mid d_2 \mid d_3 \mid \cdots \mid d_k$.


(The fact that the $d_i$s divide each other is often not that important, but diagonalizing is!) Let's do a proof by example.

Example 166


4 8 
Take A= ( a and let's try to simplify this matrix with a change of basis. 


Here $d_1 = 2$ is going to be the greatest common divisor of the entries. How are we finding $P$ and $Q$? Note that
our row-reducing (elementary integer) matrices generate the whole group $GL_2(\mathbb{Z})$ (this was also an 18.701 homework problem). So we'll just row and column reduce until our matrix looks
as simple as possible, ending at


$$A' = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}.$$


(We won't actually prove the result in more detail, but we can write out a systematic way to do this.)
Now that we have a simple form for our matrix, we can say that $A'$ operates on our group by sending our
basis elements in $V$ to simple elements of $W$:
$$v_1 \mapsto d_1w_1, \quad v_2 \mapsto d_2w_2, \quad \dots, \quad v_k \mapsto d_kw_k, \quad v_{k+1} \mapsto 0, \quad v_{k+2} \mapsto 0, \quad \dots$$


For example, the map $\begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}$ from $\mathbb{Z}^2 \to \mathbb{Z}^2$ sends the regular integer lattice to a dilated rectangular one! Since
the area of the smallest parallelogram is 8, the index of the image is also 8.
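
For a nonsingular $2 \times 2$ integer matrix, the diagonal entries can be found without carrying out the row and column operations. The sketch below does not follow the lecture's reduction; it uses the standard determinantal-divisor shortcut instead: $d_1$ is the gcd of all entries and $d_1 d_2 = |\det A|$. The function name and the sample matrix $\begin{pmatrix} 4 & 8 \\ 2 & 6 \end{pmatrix}$ are our own illustrative choices, not necessarily the matrix of Example 166.

```python
from math import gcd

def invariant_factors_2x2(a):
    """Diagonal entries (d1, d2) with d1 | d2 for a nonsingular 2x2 integer matrix."""
    (p, q), (r, s) = a
    det = abs(p * s - q * r)
    if det == 0:
        raise ValueError("this shortcut assumes a nonzero determinant")
    d1 = gcd(gcd(p, q), gcd(r, s))   # gcd of all entries
    d2 = det // d1                   # since d1 * d2 = |det A|
    return d1, d2

# Hypothetical example matrix with gcd of entries 2 and determinant 8.
print(invariant_factors_2x2([[4, 8], [2, 6]]))
print(invariant_factors_2x2([[2, 0], [0, 4]]))
```

Both matrices give $(d_1, d_2) = (2, 4)$, so both present the group $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/4\mathbb{Z}$ and both images have index 8 in $\mathbb{Z}^2$.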
So now let's shift back into looking at the case where we don't have a basis. Let our abelian group be $K$, and let's
say we have some elements $(k_1, \dots, k_m)$ which generate $K$. Then we have a surjective map
$$\mathbf{k} : \mathbb{Z}^m \to K.$$


Let $L$ be the kernel of $\mathbf{k}$. By the first isomorphism theorem, $K$ is isomorphic to the quotient group $\mathbb{Z}^m/L$.


Fact 167 


There is a second and third isomorphism theorem, but they aren’t very interesting. 


Let's choose some generators for our kernel L: we'll use the following useful fact. 


Proposition 168 


Every subgroup of a finitely generated abelian group is finitely generated.


This is basically because every element in our subgroup can be written as a combination of the generators of the 
original group, and we can't require an infinite set of generators! (We'll speak about this in a bit more detail later on, 
but the words to keep in mind are that a finitely generated abelian group is a finitely generated module over Z.) 

So $L$ can be finitely generated by some $(\ell_1, \ell_2, \dots, \ell_n)$, so that our map $\boldsymbol{\ell} : \mathbb{Z}^n \to L$ is surjective. By definition,
$L$ is the kernel of our homomorphism from $\mathbb{Z}^m$ to $K$, so $L$ is a subgroup of $\mathbb{Z}^m$. Inclusion is a homomorphism as
well, so we get a homomorphism from $\mathbb{Z}^n$ to $\mathbb{Z}^m$, which is described as multiplication by an $m \times n$ integer matrix $A$! $L$ can be
written symbolically here as $A\mathbb{Z}^n$, and now our finitely generated abelian group $K \cong \mathbb{Z}^m/L$ from above is isomorphic
to $\mathbb{Z}^m/A\mathbb{Z}^n$.

Remember that we can change bases for $\mathbb{Z}^n$ and $\mathbb{Z}^m$, so we can suppose that $A$ is the matrix with block form
$\begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}$ from the theorem above. Multiplying out $A\mathbb{Z}^n$, we find that the relations on our generators $e_1, \dots, e_m$ look
like $d_1e_1 = 0, d_2e_2 = 0, \dots, d_ke_k = 0$.


So in our example above,


$$K = \mathbb{Z}^2 \Big/ \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}\mathbb{Z}^2 = \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/4\mathbb{Z}.$$


Theorem 169 (Basis Theorem for Abelian Groups) 


Every finitely generated abelian group is isomorphic to one of the form


$$\mathbb{Z}^k \times \prod_{i=1}^{r} \mathbb{Z}/d_i\mathbb{Z},$$


where $k$ and $r$ are nonnegative integers and all $d_i$ are positive integers.


So we've now arrived at a pretty important result: all finitely generated abelian groups are isomorphic to a
product of finitely many (possibly infinite) cyclic groups!


21 April 3, 2019 


The second quiz for this class is next Wednesday, April 10 — it will cover chapters 12 and 13. 


We'll start a new topic today: 


Definition 170 
Given a ring $R$, an $R$-module $V$ is a set with two laws: addition $V \times V \to V$ and scalar multiplication $R \times V \to V$.


$V$ must be an abelian group under addition, and we also have the following axioms:
• $a(bv) = (ab)v$ for $a, b \in R$, $v \in V$, and $1v = v$.


• Distributivity holds: $a(v_1 + v_2) = av_1 + av_2$, and $(a_1 + a_2)v = a_1v + a_2v$.


Except in very simple cases for our ring R, modules can look very complicated. 


Example 171 
What does a Z-module look like? 


We need to specify how to add vectors in our abelian group $V$, and we also need to know how to multiply by
scalars. But we already know that $1v = v$, and then $2v = (1 + 1)v = v + v$ and so on (and similarly $(-1)v = -v$), so this is just an abelian
group: the scalars $\mathbb{Z}$ don't do anything extra for us.


Example 172 


Let R = F[t], where F is a field. What does an R-module look like? 


We know that V is an F-vector space, if we just forget about the ts and look at the constant polynomials. In 


addition to that, we can also multiply by our “scalar” t to get a new element tv for any v € V. So define some linear 
operator $T : V \to V$ by
\[ T(v) = tv. \]
It is true that $T(v_1 + v_2) = t(v_1 + v_2) = tv_1 + tv_2 = T(v_1) + T(v_2)$, and $T(cv) = t(cv) = (tc)v = (ct)v = c(tv) = cT(v)$ for $c \in F$, so this is indeed a linear operator. So an $R$-module gives us a linear operator, as well as a vector space.


Conversely, if we have an $F$-vector space $V$ and a linear operator $T : V \to V$, we can make $V$ into an $R$-module (where $R = F[t]$) with $tv = T(v)$, $t^2v = T(T(v))$, and so on. So there's a correspondence between $F[t]$-modules


and F-vector spaces with linear operators! 
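To make the correspondence concrete, here is a minimal Python sketch of how a polynomial $g(t)$ acts on a vector once $t$ is declared to act as a matrix $T$. The helper names `mat_vec` and `act` and the sample nilpotent matrix are illustrative choices, not notation from the lecture.

```python
from fractions import Fraction

def mat_vec(T, v):
    """Multiply the square matrix T (a list of rows) by the column vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in T]

def act(g, T, v):
    """Return g(t) * v, where t acts as T.

    g = [g0, g1, ...] lists coefficients from the constant term upward.
    Horner's rule: g(T)v = g0 v + T(g1 v + T(g2 v + ...)).
    """
    result = [Fraction(0)] * len(v)
    for coeff in reversed(g):
        result = mat_vec(T, result)
        result = [r + coeff * x for r, x in zip(result, v)]
    return result

T = [[0, 1],
     [0, 0]]                       # a sample operator on F^2 with T^2 = 0
v = [0, 1]
print(act([3, 2], T, v))           # (3 + 2t) v = 3v + 2Tv
print(act([0, 0, 1], T, v))        # t^2 v = T(T(v)) = 0
```

The module axioms hold automatically because applying $g(T)$ is linear in $v$ and polynomial multiplication corresponds to composing operators.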


Example 173 


What is an R-module, where R = F[x, y]? 


Then R-modules correspond to an F-vector space with two linear operators: one that sends v to xv, and one that 
sends v to yv. But remember that xy = yx, so our operators need to commute! In other words, the R-modules are 


in one-to-one correspondence with F-vector spaces with two linear operators X, Y satisfying XY = YX. 


Example 174 


What is a $\mathbb{Z}[i]$-module?


The Gaussian integers contain the integers, so we have our usual abelian group $V$ if we ignore $i$. Now we have to think about what it means to multiply by $i$ in our abelian group: it has to be some homomorphism $\varphi : V \to V$ from the group to itself, but $i^2 = -1$. So $\varphi(v) = iv$ forces $\varphi^2 = -\mathrm{id}$. For example, if our set $V$ is $\mathbb{Z}^2$, we can have $\varphi$ be the $2 \times 2$ matrix $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.


Let's (again) go through the logic for the basis calculations — this should look familiar from earlier in the class. 


Suppose we have an $R$-module $V$, and let's say we have some elements $\mathbf{V} = (v_1, v_2, \ldots, v_m)$ of $V$. We have a homomorphism
\[ \mathbf{V} : R^m \to V \]
sending $x \mapsto \mathbf{V}x = \sum_i v_i x_i$, and this is a homomorphism of $R$-modules. Let's make precise what a homomorphism means here:


Definition 175 


A homomorphism $\varphi : W \to V$ of $R$-modules is a map that is compatible with the operations: $\varphi(w_1 + w_2) = \varphi(w_1) + \varphi(w_2)$, $\varphi(rw) = r\varphi(w)$.


We'll have the same notions of kernel and image as in the usual group and ring homomorphisms. We can say that $\mathbf{V}$, our homomorphism, is independent if our map is injective, that it generates $V$ if $\mathbf{V}$ is surjective, and that it is a basis if it is bijective. It's harder to satisfy all of these, now that we have extra structure on our abelian group!


Definition 176 


An $R$-module $V$ is finitely generated if there exists a finite set $\mathbf{V}$ that generates $V$, and $V$ is a free module if there exists a basis $\mathbf{V}$.


Since a basis $\mathbf{V}$ is a bijective map from $R^m$ to $V$, free modules $V$ are isomorphic to $R^m$. So let's only say that $V$ is finitely generated, and suppose $\mathbf{V} : R^m \to V$ is our surjective map. Letting $W = \ker \mathbf{V}$, we know that $V$ is isomorphic to $R^m/W$.
Question 177. Is $W$ finitely generated?


We're going to defer this to later, but let's assume that $W$ is indeed finitely generated. Then we can say those generators are $\mathbf{W} = (w_1, \ldots, w_n)$, and these generators give us a map $\mathbf{W} : R^n \to W \subset R^m$ sending $x$ to $\mathbf{W}x$, and this is described by an $R$-matrix $A$ (which has dimensions $m \times n$).

Now $W$ is the image of $\mathbf{W}$, which is $AR^n$, so our $R$-module is isomorphic to $R^m/W = R^m/(AR^n)$. In this case, we say that $A$ presents the module $V$ (because it tells us the relations).
Example 178


Let's say $R = F[t]$, and $A = \begin{pmatrix} t^2+1 & t \\ t+1 & t^2+t \end{pmatrix}$. What does our $R$-module look like?


We have $m = n = 2$, and $R^2$ has basis $e_1, e_2$, so $V$ is generated by the images $v_1, v_2$ of $e_1, e_2$ under some map. But we also know something else about $v_1$ and $v_2$: the image of $A$ is generated (with coefficients in $F[t]$) by the two column vectors
\[ \begin{pmatrix} t^2+1 \\ t+1 \end{pmatrix}, \quad \begin{pmatrix} t \\ t^2+t \end{pmatrix} \in R^2 = F[t]^2. \]
We want to mod out by $AR^2$ (this is the kernel of the map $\mathbf{V}$ mentioned above), so we have the relations
\[ (t^2+1)v_1 + (t+1)v_2 = 0, \quad tv_1 + (t^2+t)v_2 = 0 \]
coming from the generators of $AR^2$. This is not a particularly nice looking set.


But R = F[t] is very simple: we can make our matrix A look a bit simpler! Remember that we can do division 
by remainder for Z-matrices A: we basically use the Euclidean algorithm by looking only at the first column, and then 


rinse and repeat. Does something work here as well? 


Theorem 179 
Let $R = F[t]$, and let $A$ be an $m \times n$ $R$-matrix. Then there exist invertible $R$-matrices $P, Q$ such that $A' = Q^{-1}AP$ is diagonal with entries $d_1 \mid d_2 \mid \cdots \mid d_k$.


The proof is very similar: let’s do another proof by example. We row-reduce to find that 


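\[ \begin{pmatrix} t^2+1 & t \\ t+1 & t^2+t \end{pmatrix} \to \begin{pmatrix} 1 & t \\ t+1-t^3-t^2 & t^2+t \end{pmatrix} \to \begin{pmatrix} 1 & 0 \\ t+1-t^3-t^2 & t^4+t^3 \end{pmatrix} \to \begin{pmatrix} 1 & 0 \\ 0 & t^4+t^3 \end{pmatrix}. \]


So what's the module with presentation matrix $A' = \begin{pmatrix} 1 & 0 \\ 0 & t^4+t^3 \end{pmatrix}$? Here $V$ is generated by two elements $v_1', v_2'$ (since we changed a basis) and our relations are
\[ 1v_1' + 0v_2' = 0, \quad 0v_1' + (t^4+t^3)v_2' = 0. \]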


So $V$ is generated by a single element $v = v_2'$ with $(t^4 + t^3)v = 0$ (the first relation just says $v_1' = 0$). Remember that an $R$-module for $R = F[t]$ is a vector space with a linear operator: to find our vector space, we have elements of the form $cv$, $c \in F$, and then $V$ is spanned by $\{v, tv, t^2v, t^3v\}$ (since $t^4v = -t^3v$). Calling those elements $v_1, v_2, v_3, v_4$, we have a linear operator
\[ Tv_1 = v_2, \quad Tv_2 = v_3, \quad Tv_3 = v_4, \quad Tv_4 = -v_4 \implies T = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}, \]
and now the combination of our abelian group $V$ and linear operator $T$ is the $R$-module we're looking for.
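One quick sanity check on the diagonal form: invertible $R$-matrices have determinants that are units of $F[t]$ (nonzero constants), so $\det A$ must equal the product of the diagonal entries up to a constant. The sketch below computes $\det A$ with integer coefficient lists; the helper names are illustrative, and the computation is a polynomial identity with integer coefficients, so it holds over any field $F$.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_sub(p, q):
    """Subtract q from p and strip trailing zero coefficients."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    out = [a - b for a, b in zip(p, q)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

# Entries of A, coefficients listed from the constant term upward.
a11, a12 = [1, 0, 1], [0, 1]        # t^2 + 1,  t
a21, a22 = [1, 1],    [0, 1, 1]     # t + 1,    t^2 + t

det_A = poly_sub(poly_mul(a11, a22), poly_mul(a12, a21))
print(det_A)                        # coefficients of t^3 + t^4
```

The result $t^4 + t^3$ matches $1 \cdot (t^4 + t^3)$, the product of the diagonal entries of $A'$.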
22 April 5, 2019


A correction: the quiz is actually on Monday. Recall that if we have an $R$-module $V$, and $\mathbf{V} = (v_1, \ldots, v_m)$ is a set of elements of $V$, then $\mathbf{V}$ generates $V$ if every $v \in V$ is a linear combination
\[ v = \sum_i v_i x_i, \quad x_i \in R, \]
or equivalently if the map
\[ \mathbf{V} : R^m \to V \]
sending $x$ to $\mathbf{V}x = \sum_i v_i x_i$ is surjective. Today, we'll discuss finitely generated modules, which are modules where such a finite set $\mathbf{V}$ exists.


Proposition 180 
Let $\varphi : V \to W$ be a surjective homomorphism of $R$-modules, and let $\ker \varphi = U$. If $U$ and $W$ are finitely generated, so is $V$, and if $V$ is finitely generated, so is $W$.


It turns out that $U$ is not always finitely generated for modules in general, even when $V$ is.


Proof. Let's first prove the second statement. If some set $\mathbf{V} = (v_1, \ldots, v_m)$ generates $V$, let $\mathbf{W} = (w_1, \ldots, w_m)$ be the images of the $v_i$: we'll show that these generate $W$. Since we have a surjective homomorphism, for any $w \in W$, there exists a $v \in V$ such that $\varphi(v) = w$. Now $v = \sum x_i v_i$ for some $x_i \in R$, and therefore $w = \sum x_i w_i$ is a linear combination of the elements of $\mathbf{W}$, as desired.

Now, let's prove the first statement. Let $\mathbf{U} = (u_1, \ldots, u_n)$ generate $U$, and $\mathbf{W} = (w_1, \ldots, w_m)$ generate $W$. Since we have a surjective homomorphism, we can let $v_i \in V$ be elements such that $\varphi(v_i) = w_i$: we claim that $(v_1, \ldots, v_m, u_1, \ldots, u_n)$ generate $V$. To show this, take any element $v \in V$ and let $w = \varphi(v)$. Then $w = \sum w_i a_i$ for some $a_i \in R$, since the $w_i$ generate $W$. Now if we define $v' = \sum v_i a_i$, then $\varphi(v') = w = \varphi(v)$, so $v - v'$ must be an element of $\ker \varphi = U$. Therefore $v - v'$ is a linear combination of the $u_j$, meaning we've written $v$ as a linear combination of $v_i$s and $u_j$s, as desired.


Next, let’s ask a new question: what are submodules of R if we treat R as an R-module? We need to have closure 


under addition and scalar multiplication by elements of R, so this is just an ideal! 


Definition 181 


A ring R is Noetherian if every ideal is finitely generated. 


This means that we can always find a finite set of elements in the ideal such that every element of the ideal is an 
$R$-linear combination of those elements. (And this is the missing ingredient we need to answer Question 177, because the kernel of a homomorphism is an ideal.)


Example 182 


Z and F[t] (for a field F) are Noetherian rings, because they are principal ideal domains — all ideals are just 


generated by one element. 
Theorem 183 (Hilbert Basis Theorem, version 2)


If R is a Noetherian ring, then R[x] is Noetherian. 


We can substitute this into itself a bunch of times, too! By induction, any polynomial ring $R[x_1, \ldots, x_n]$ is Noetherian if $R$ is Noetherian: in particular, $F[x_1, \ldots, x_n]$ and $\mathbb{Z}[x_1, \ldots, x_n]$ are Noetherian.

There's no bound on the number of generators needed for an ideal of such a polynomial ring, though! (We just know that it's always finite.) For example, let $I_d \subset F[x, y]$ be the polynomials in two variables with no terms of degree less than $d$. This is an ideal because it is closed under addition within $I_d$ and multiplication by any polynomial, and now notice that (for example) $I_5$ is generated by $x^5, x^4y, x^3y^2, x^2y^3, xy^4, y^5$. But in general $I_d$ needs $d + 1$ elements to be generated, which is unbounded.
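The generating set for $I_d$ can be listed mechanically. The following Python sketch (with made-up helper names) enumerates the $d + 1$ monomials of degree $d$ and checks that a monomial $x^i y^j$ is a multiple of one of them exactly when $i + j \ge d$. It only illustrates that these monomials generate $I_d$; it does not prove that fewer generators cannot work.

```python
def generators(d):
    """Exponent pairs (i, j) of the monomials x^i y^j with i + j = d."""
    return [(d - j, j) for j in range(d + 1)]

def divisible_by_some_generator(i, j, d):
    """True if x^i y^j is a multiple of one of the degree-d monomials."""
    return any(i >= a and j >= b for a, b in generators(d))

print(generators(5))       # the six monomials x^5, x^4 y, ..., y^5
print(len(generators(5)))  # d + 1 generators for d = 5
```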


Proof. Let $I$ be an ideal of $R[x]$ for a Noetherian ring $R$. For any polynomial $f(x) = ax^n + \cdots$, define $a$ to be the leading coefficient of $f$.


Lemma 184 


Let $A$ be the set of leading coefficients of polynomials $f \in I$, plus 0. Then $A$ is an ideal of $R$.


Proof of lemma. Closure is clear for 0: $a + 0 = a$ and $a \cdot 0 = 0$, which are both in $A$ for sure. If $a, b \in A$ are both nonzero, then there exist polynomials $f = ax^m + \cdots$ and $g = bx^n + \cdots$ in $I$: without loss of generality, let $m \le n$, and now
\[ x^{n-m}f(x) + g(x) = (a + b)x^n + \cdots. \]
If $a + b = 0$, 0 is already in $A$; otherwise, $a + b$ is the leading coefficient of a polynomial in $I$, so we have closure under addition. Similarly, if $a$ is the leading coefficient of $f(x) = ax^m + \cdots$, then $ra$ is the leading coefficient of $rf(x) = rax^m + \cdots$ unless it is zero (which is already in $A$).


So now because $A$ is an ideal, and its elements are in $R$ (a Noetherian ring), $A$ is generated by some $(a_1, \ldots, a_k)$, where each $a_i$ is the leading coefficient of some polynomial $f_i \in I$. Multiply all the $f_i$ by powers of $x$ so that the degrees are all the same: let's say this degree is $m$.

Now our $f_i$ don't quite generate our ideal, but we can still do things with this new idea! We're going to do something similar to the division algorithm: for any polynomial $g \in I$ which is not equal to 0, we can write it as
\[ g = bx^n + \cdots. \]
Since $b$ is a leading coefficient, it is in $A$, and thus we can write $b = \sum a_i r_i$ for some $r_i \in R$. Now if $n \ge m$, we know that
\[ h = \sum_{i=1}^{k} x^{n-m} f_i r_i \]
is an element of $I$ with degree equal to $n$ and leading coefficient equal to that of $g$, because $x^{n-m}f_i$ is a polynomial of leading coefficient $a_i$ and degree $n$ by construction. This means that the difference $g - h$ has degree less than $n$.

We can repeat this until what remains of $g$ has degree less than $m$: the idea is that all polynomials $g$ in our ideal $I$ are a linear combination of our $f_i$, except potentially with a remainder of degree less than $m$. So the division algorithm gives us a combination $\sum p_i f_i$ in the ideal generated by $(f_1, \ldots, f_k)$, and now we just need to finitely generate elements of the form $g - \sum p_i f_i$: these are some polynomials of degree less than $m$.
Lemma 185


If $R$ is a Noetherian ring and $V$ is a finitely generated $R$-module, then every submodule of $V$ is finitely generated.


Proof of lemma. If $V$ is a finitely generated $R$-module and $U$ is a submodule, we have surjective maps $R^m \to V$ and $U' \to U$, where $U'$ (the preimage of $U$) is a submodule of $R^m$. Since these are surjective homomorphisms, it is enough to show that every submodule of $R^m$ is finitely generated.

We'll use induction. The base case $m = 1$ is true by definition of a Noetherian ring. Now suppose $V = R^m$, and let $\pi$ be the homomorphism from $R^m$ to $R^{m-1}$ that drops the last component. If we have our surjective map $\pi' : U \to \bar{U}$, where $U \subset R^m$ and $\bar{U} = \pi(U) \subset R^{m-1}$, we know that $\bar{U}$ is finitely generated by induction, and now the kernel $K$ of $\pi'$ is a subset of $\ker \pi$. $\ker \pi$ is isomorphic to $R$ (it consists of the vectors whose only nonzero entry is the last one), which is a Noetherian ring, so $K$ must be finitely generated because it's an ideal! Thus $\bar{U}$ and $K$ are finitely generated, and therefore so is $U$ (by Proposition 180).


So now, the polynomials of degree less than $m$, plus the zero polynomial, form a free $R$-module $P_m$ with basis $(1, x, \ldots, x^{m-1})$. $I \cap P_m$, which is the polynomials with degree less than $m$ in $I$ (plus zero), is a submodule, and thus it is finitely generated by some set $(g_1, \ldots, g_\ell)$, and now that set plus our original polynomials $f_1, \ldots, f_k$ generate $I$, as desired, finishing our proof.


Fact 186 


Hilbert proved many things in 1895. The paper’s in German, though... 


23 April 10, 2019 


(We got more cookies in class.) We're going to finish talking about modules today and move on to fields on Friday. 
Recall that in a finitely generated $R$-module $V$, there exist elements $(v_1, \ldots, v_k)$ such that every element $v \in V$ can be written as a linear combination $\sum r_i v_i$. We found that a submodule of $R$ (as a module) is just an ideal, and we
defined the notion of a Noetherian ring to be one where every ideal is finitely generated. This led us to the Hilbert 
Basis Theorem, which says that R[x] is Noetherian for any Noetherian ring R. Applying this repeatedly, we know that 


$F[x_1, \ldots, x_n]$ is always Noetherian for any field $F$, and so is $\mathbb{Z}[x_1, \ldots, x_n]$.


Proposition 187 


If we have a surjective homomorphism $\varphi : R \to R'$, and $R$ is a Noetherian ring, then so is $R'$.


So any ring that is a quotient of a polynomial ring over a field is also Noetherian — this is basically all of the rings 


we've been looking at so far! Note that here our ring $R' \cong R/\ker \varphi$.


Proof. If $I'$ is an ideal of $R'$, consider the inverse image $I = \varphi^{-1}(I')$, which is an ideal of $R$. Since $R$ is Noetherian, $I$ is finitely generated by some elements $(v_1, \ldots, v_k)$, and all $w$ in $I$ are combinations of the form $w = \sum r_i v_i$. So now the images of the $v_i$ will generate $I'$: if $w' \in I'$, we can choose $w \in I$ such that $\varphi(w) = w'$ by surjectivity, and then
\[ w = \sum r_i v_i \implies w' = \sum r_i' v_i' \]
(here $r_i' = \varphi(r_i)$), and thus the $v_i' = \varphi(v_i)$ generate $I'$. Therefore every ideal of $R'$ is finitely generated, and thus $R'$ is Noetherian.


So to find a non-Noetherian ring, we need to add infinitely many elements, and we'll do that now: 
Example 188


Take $R$ to be the ring $R = F[x_1, x_2, \ldots]$.


Any particular polynomial only uses finitely many monomials (because it’s a finite combination of monomials), but 
the ring can contain infinitely many monomials. 

This is not Noetherian: let $I$ be the set of polynomials that evaluate to 0 at 0. Equivalently, we can say that this is the set of polynomials with no constant term. But the ideal contains $x_1, x_2, \ldots$, and it can't be finitely generated
because we have infinitely many variables and each generator can only use finitely many of them. 

So now let $V$ be a finitely generated $R$-module, where $R$ is a Noetherian ring. Remember that if $\mathbf{V} = (v_1, \ldots, v_m)$ generate $V$, we have a homomorphism $\varphi : R^m \to V$ sending a vector $x \mapsto \mathbf{V}x$. We can get every vector in $V$ (because this map is surjective by definition of $\mathbf{V}$ generating $V$). Letting $W = \ker \varphi$, $W$ is now a submodule of $R^m$, so it is
finitely generated as well (by Lemma 185). Basically, it’s enough to show that a submodule of the free module is 
finitely generated, and we use induction. 

So now if we choose our generators $\mathbf{W} = (w_1, \ldots, w_n)$ and consider our surjective homomorphism $R^n \to W \subset R^m$, we can compose the map $R^n \to W$ and the inclusion $W \to R^m$: then this is represented by an $m \times n$ matrix $A$ with entries in $R$. So now $W = AR^n$, and by the first isomorphism theorem, $V$ is isomorphic to $R^m/\ker \varphi = R^m/W$, and thus our finitely generated $R$-module looks like
\[ V \cong R^m/(AR^n). \]


We refer to this by saying that A presents V: sometimes we can simplify our matrix A quite a bit, especially if we 
have the division algorithm (in which case we can diagonalize with base-change matrices). Looking more in detail at 
the base-change matrices, recall that we can replace $A \to Q^{-1}AP$, where $Q$ is an invertible $m \times m$ $R$-matrix and $P$ is an invertible $n \times n$ $R$-matrix. Specifically, if $R = \mathbb{Z}$ or $R = F[t]$, we can make our matrix $Q^{-1}AP$ diagonal. This
led us to the Basis Theorem for abelian groups: every finitely generated abelian group is the product of (possibly 
infinite) cyclic groups. 

Also recall that if we have a ring $R = F[t]$, an $R$-module $V$ is just an $F$-vector space with a linear operator $T : V \to V$, where we send $T(v) = tv$ (so that we know what happens to $t$, $t^2$, and so on). So let $A$ be an $m \times n$ matrix with entries in $R$, and let's say we can make it diagonal: we write
\[ A = \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix}, \]
where $D$ has $k$ nonzero diagonal entries $f_1 \mid f_2 \mid \cdots \mid f_k$ that successively divide each other. So now if we present $V$ as $R^m/(AR^n)$, let's think about what $V$ looks like: we have generators $\mathbf{V} = (v_1, \ldots, v_m)$, and now each column vector gives us a relation on our generators. Since each column vector only contains one nonzero entry (because we diagonalized), we have
\[ V \cong V_1 \times \cdots \times V_k \times V_{k+1} \times \cdots \times V_m, \]
where $V_i \cong R/(f_i R)$ for all $1 \le i \le k$, and $V_{k+1}, \ldots, V_m \cong R$ (there are no relations).


Definition 189 


A cyclic R-module V is one that is generated by a single element. 


In such a module, we have a map $R \to V$ which sends $1 \mapsto v$: this is a surjective homomorphism with some kernel $I$, so by the first isomorphism theorem, $V \cong R/I$.
In the case where $R = F[t]$ for a field $F$, we have $I = fR$ for some $f \in F[t]$ (because we have a principal ideal domain), so we must have $V \cong R/(fR)$.


Theorem 190 (Structure theorem for modules over a polynomial ring) 


Let R = F[t] for a field F. Then every finitely generated R-module is a product of cyclic R-modules. 


This is the exact analog of the structure theorem for abelian groups! Let's translate this statement to one about linear operators: we know that we have a linear operator $T : V \to V$. Take a cyclic module $V$: as a module, $V$ is generated by one element $v$, so every $w \in V$ is of the form $w = g(t)v$ for some polynomial $g(t)$. Writing $g(t) = b_k t^k + b_{k-1}t^{k-1} + \cdots + b_0$, we have $w = b_k t^k v + b_{k-1}t^{k-1}v + \cdots + b_0 v$. Since our linear operator $T$ corresponds to multiplying by $t$, this can also be written as
\[ w = b_k T^k v + b_{k-1}T^{k-1}v + \cdots + b_0 v, \]
so $v, Tv, T^2v, \ldots$ span $V$. If $V$ is isomorphic to $R/(0)$, then there are no relations: $V$ has an $F$-basis $v_0 = v, v_1 = Tv, v_2 = T^2v, \ldots$, and $T$ is the shift operator between the basis elements. On the other hand, if $V$ is isomorphic to $R/(fR)$ for some polynomial $f = t^n + a_{n-1}t^{n-1} + \cdots + a_0$, we have the relation
\[ v_n + a_{n-1}v_{n-1} + \cdots + a_0 v_0 = 0, \]
and thus we have an $F$-basis $v_0, \ldots, v_{n-1}$. Then the matrix of $T$ has 1s directly below the diagonal and $-a_0, -a_1, \ldots, -a_{n-1}$ in the last column. (This is called the rational canonical form of the matrix.)
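As a rough illustration of the description above, this Python sketch builds the matrix of $T$ on $R/(fR)$ from the coefficients $a_0, \ldots, a_{n-1}$ of a monic $f$ and checks that $f(T) = 0$. The function names `companion`, `mat_mul`, and `evaluate_f_at` are chosen here for illustration.

```python
def companion(a):
    """Matrix of T in the basis v_0, ..., v_(n-1) for the monic polynomial
    f = t^n + a[n-1] t^(n-1) + ... + a[0]."""
    n = len(a)
    M = [[0] * n for _ in range(n)]
    for i in range(1, n):
        M[i][i - 1] = 1          # T v_(i-1) = v_i: ones below the diagonal
    for i in range(n):
        M[i][n - 1] = -a[i]      # T v_(n-1) = -a0 v_0 - ... - a(n-1) v_(n-1)
    return M

def mat_mul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def evaluate_f_at(a, M):
    """Return f(M) = a0*I + a1*M + ... + a(n-1)*M^(n-1) + M^n."""
    n = len(a)
    power = [[int(i == j) for j in range(n)] for i in range(n)]   # M^0 = I
    result = [[0] * n for _ in range(n)]
    for coeff in a + [1]:        # a0, ..., a(n-1), then the leading 1
        result = [[result[i][j] + coeff * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, M)
    return result

T = companion([0, 0, 0, 1])      # f = t^4 + t^3
print(T)
print(evaluate_f_at([0, 0, 0, 1], T))   # the zero matrix
```

For $f = t^4 + t^3$ this reproduces the $4 \times 4$ matrix found in Example 178.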

So looking at our structure theorem, if $k$ is equal to the number of generators (so we don't have infinite-dimensional vector spaces), $V$ is the product of finite-dimensional vector spaces with dimension equal to the degree of the corresponding $f_i$.


Corollary 191 


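If $V$ is a finite-dimensional vector space, and $T : V \to V$ is a linear operator, then there is a basis for $V$ such that the matrix $M$ for $T$ has the block-diagonal form
\[ M = \begin{pmatrix} B_1 & & \\ & \ddots & \\ & & B_k \end{pmatrix}, \]
where each $B_i$ is in rational canonical form.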


This is the best we can do over an arbitrary field! (Jordan normal form is nicer for C.) 


Fact 192 
Professor Artin recommends that we don't take pictures of the whiteboard. There is a connection between the 


mind and hand, so we should take notes on paper instead. (Hmm... maybe I should stop taking notes on a laptop.)


24 April 12, 2019 


We're going to discuss fields today — most of the discussion centers around containing one field in another one. 


Definition 193 


If we have fields F, K, and F C K, then K is a field extension of F. 
Example 194
We can take $F = \mathbb{Q}$ and let $K = F[\delta]$, where $\delta^2 = 2$. Then elements of $K$ are of the form
\[ K = \{a + b\delta : a, b \in F\}. \]


We can think of K as a vector space over the field F: in this case, it has dimension 2. (Addition and scalar 


multiplication by F in K are just the standard addition and multiplication, since elements in F are in K.) 


Definition 195 


The dimension of K as an F-vector space is called the degree of K over F, denoted [K : F]. 


Some examples: 
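\[ [\mathbb{C} : \mathbb{R}] = 2, \quad \mathbb{C} \text{ has basis } (1, i), \]
\[ [\mathbb{Q}[\sqrt{2}] : \mathbb{Q}] = 2, \quad \mathbb{Q}[\sqrt{2}] \text{ has basis } (1, \sqrt{2}). \]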


Here's one thing we can do with this: let $K/F$ be a field extension, and let $\alpha \in K$. We can then map polynomials
\[ \varphi : F[x] \to K \]
by sending elements of the field to themselves and sending $x$ to $\alpha$. This helps us analyze "part" of $K$: the image of $\varphi$ here is all elements of the form
\[ \{\beta \in K : \beta \text{ can be written as a polynomial in } \alpha \text{ with coefficients in } F\}. \]
On the other hand, the kernel of $\varphi$ is all polynomials $g(x)$ such that $g(\alpha) = 0$: this is an ideal, and since $F[x]$ is a principal ideal domain, this means it is a principal ideal. There are two possibilities: if the kernel is trivial (only the zero polynomial), then $\alpha$ is transcendental. That means it's not the root of any nonzero polynomial $g(x)$ with $F$-coefficients! On the other hand, if the kernel is generated by some nonzero polynomial $f$, then $f$ is irreducible: if we could write it as $f = gh$ with $g, h$ of lower degree, then $f(\alpha) = g(\alpha)h(\alpha) = 0$, so one of $g, h$ would be an element of the kernel with lower degree than the generator. Therefore, the ideal $(f)$ is a maximal ideal, and we'll study this case in more detail.

So going back to the image, if $\alpha$ is transcendental, our map $\varphi$ is injective, and then the image $F[\alpha]$ is just isomorphic to $F[x]$. However, if $\alpha$ is not transcendental, by the first isomorphism theorem, $F[x]/(f)$ is isomorphic to the image of $\varphi$, and since $(f)$ is maximal, this is a field. Then the image of $\varphi$, which is $F[\alpha]$, is the set of elements $\beta \in K$ that can be written as a polynomial in $\alpha$.
Question 196. How do we compute in $F[\alpha]$, which is isomorphic to $F[x]/(f)$, for some irreducible polynomial $f$?
We can divide out by the leading coefficient, since $F$ is a field, so let's say
\[ f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0. \]
Since $\varphi(f) = 0$,
\[ f(\alpha) = \alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_0 = 0, \]
and that means we can write $\alpha^n$ as a linear combination of $1, \alpha, \ldots, \alpha^{n-1}$:
Proposition 197
$F[\alpha]$ has an $F$-basis
\[ (1, \alpha, \alpha^2, \ldots, \alpha^{n-1}) \]
if the minimal polynomial $f$ has degree $n$.


So if we are computing in $F[\alpha]$, we have combinations of basis elements with $F$-coefficients: adding is done component-wise, and if we have polynomials
\[ g(x) = b_0 + b_1 x + \cdots + b_{n-1}x^{n-1}, \]
\[ h(x) = c_0 + c_1 x + \cdots + c_{n-1}x^{n-1}, \]
then $g(\alpha)$ and $h(\alpha)$ are arbitrary elements of $F[\alpha]$. Then we multiply $g(\alpha)$ and $h(\alpha)$ by using the division algorithm: we can first write
\[ g(x)h(x) = f(x)q(x) + r(x), \]
where $r(x)$ is a polynomial with degree less than $n$, and then say that
\[ g(\alpha)h(\alpha) = f(\alpha)q(\alpha) + r(\alpha) = r(\alpha). \]


Notably, we can always compute inverses in F[a] because we have a field, but this isn’t immediately obvious from the 
form of our elements! 
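The multiplication rule above is short to write out in code. Here is a minimal Python sketch over $F = \mathbb{Q}$ using exact fractions: polynomials are coefficient lists (constant term first), and the product is reduced modulo a monic $f$. The function names are illustrative, and the example uses $f = x^2 - 2$, so $\alpha$ plays the role of $\sqrt{2}$.

```python
from fractions import Fraction

def poly_mul(g, h):
    """Multiply two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(g) + len(h) - 1)
    for i, a in enumerate(g):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out

def remainder(p, f):
    """Remainder r(x) of p(x) on division by the monic polynomial f(x)."""
    p = list(p)
    n = len(f) - 1                      # degree of f
    while len(p) > n:
        lead = p.pop()                  # current leading coefficient of p
        shift = len(p) - n              # subtract lead * x^shift * f(x)
        for i in range(n):
            p[shift + i] -= lead * f[i]
    return p + [Fraction(0)] * (n - len(p))

def multiply_in_F_alpha(g, h, f):
    """Coefficients of g(alpha) * h(alpha) in the basis 1, alpha, ..."""
    return remainder(poly_mul(g, h), f)

f = [-2, 0, 1]                                   # x^2 - 2
print(multiply_in_F_alpha([1, 1], [3, -1], f))   # (1 + a)(3 - a) = 1 + 2a
```

By hand, $(1 + \sqrt{2})(3 - \sqrt{2}) = 3 + 2\sqrt{2} - 2 = 1 + 2\sqrt{2}$, which agrees with the remainder computed by the sketch.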

Usually when we have an extension $K$ of $F$ with finite degree, we can generate it with one element, but the proof is tricky: we'll defer it to later. Let's think about how a construction of a field extension can be done more abstractly: we start with a field $F$, and we find some irreducible polynomial $f \in F[x]$. This forms a field $K = F[x]/(f)$ because $(f)$ is maximal, and if the residue of $x$ is $\alpha$, then $K = F[\alpha]$.


Example 198 


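Take $F = \mathbb{F}_2$, the field with two elements $\{0, 1\}$.


Recall that $f = x^3 + x + 1$ is irreducible in $F[x]$, because it is a cubic and doesn't have a root. So now $K = F[x]/(f)$ has an $F$-basis $(1, \alpha, \alpha^2)$, and we know that $\alpha^3 = -\alpha - 1 = \alpha + 1$ in our field.


Let's find the inverse of $\alpha^2 + 1$. This means we want to find a polynomial such that
\[ (1 + \alpha^2)(c_0 + c_1\alpha + c_2\alpha^2) = 1. \]
Expanding (using $\alpha^3 = \alpha + 1$ and $\alpha^4 = \alpha^2 + \alpha$) and equating coefficients on both sides, this means
\[ (c_0 + c_1) + (c_1 + c_1 + c_2)\alpha + (c_0 + c_2 + c_2)\alpha^2 = 1 + 0\alpha + 0\alpha^2. \]
So $c_0 = 0$, $c_1 = 1$, $c_2 = 0$: indeed
\[ \alpha(\alpha^2 + 1) = \alpha^3 + \alpha = 1. \]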


(We probably didn’t need to do all that to find the answer...) 
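Since this field has only eight elements, the inverse can also be found by brute force. The following Python sketch represents $c_0 + c_1\alpha + c_2\alpha^2$ as a tuple `(c0, c1, c2)` of bits and hard-codes the reductions $\alpha^3 = \alpha + 1$ and $\alpha^4 = \alpha^2 + \alpha$; the function names are illustrative.

```python
from itertools import product

def mul_mod2(g, h):
    """Multiply two elements (c0, c1, c2) of F_2[a] with a^3 = a + 1."""
    prod = [0] * 5
    for i, x in enumerate(g):
        for j, y in enumerate(h):
            prod[i + j] ^= x & y          # addition in F_2 is XOR
    if prod[4]:                           # a^4 = a^2 + a
        prod[2] ^= 1
        prod[1] ^= 1
    if prod[3]:                           # a^3 = a + 1
        prod[1] ^= 1
        prod[0] ^= 1
    return tuple(prod[:3])

def inverse(g):
    """Search all 8 elements for h with g * h = 1; None if g = 0."""
    for h in product((0, 1), repeat=3):
        if mul_mod2(g, h) == (1, 0, 0):
            return h
    return None

print(inverse((1, 0, 1)))   # inverse of 1 + a^2, expected to be a = (0, 1, 0)
```

For larger fields one would use the extended Euclidean algorithm instead of a search, but for eight elements the search is the simplest thing that works.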
The degree of our field extension has two main properties:


Theorem 199
If $K = F[\alpha]$, then the degree $[K : F]$ is the degree of the irreducible (minimal) polynomial of $\alpha$. Also, if we have fields $F \subset K \subset L$, then
\[ [L : F] = [L : K][K : F]. \]


Example 200 
Let $F = \mathbb{Q}$, $K = F[\delta]$ where $\delta^2 = 2$, and $L = K[\varepsilon]$, where $\varepsilon^3 = 3$.


Then the degree of $L$ over $F$ is
\[ [L : F] = [L : K][K : F]. \]
$x^3 - 3$ is irreducible over $F$ (for example, by Eisenstein). What's the irreducible polynomial of $\varepsilon$ over $K$? We know that $K$ contains $F$, so $x^3 - 3$ is a polynomial with coefficients in $K$, and it has $\varepsilon$ as a root. The set of all polynomials with $\varepsilon$ as a root is a principal ideal, so if the minimal polynomial were $f$, $f$ would need to divide $x^3 - 3$. But $x^3 - 3$ doesn't have any roots in $K$ either, so (being a cubic) it has no proper factor over $K$ and is itself the minimal polynomial. To see that there are no roots, we could, for example, just write a candidate root as $a + b\delta$ with $a, b \in \mathbb{Q}$ and expand. If $x^3 = 3$ had a root in $K$, then we must have
\[ 3 = (a + b\delta)^3 = a^3 + 3a^2b\delta + 6ab^2 + 2b^3\delta \implies 3a^2b + 2b^3 = 0, \quad a^3 + 6ab^2 = 3. \]
We can amuse ourselves by showing there aren't any solutions here for $a, b \in \mathbb{Q}$, so $[L : K] = 3$, and therefore
\[ [L : F] = [L : K][K : F] = 3 \cdot 2 = 6. \]
So what's an $F$-basis for our field extension $L$? We know that
\[ L = K[\varepsilon] = F[\delta, \varepsilon]. \]
Since $\delta$ has degree 2 and $\varepsilon$ has degree 3, a natural basis to consider is $\{1, \delta, \varepsilon, \delta\varepsilon, \varepsilon^2, \delta\varepsilon^2\}$. It's not hard to prove that

In general, if we have $F, K, L$ such that $[L : K] = n$ and $[K : F] = m$, let $(\alpha_1, \ldots, \alpha_m)$ be an $F$-basis for $K$, and let $(\beta_1, \ldots, \beta_n)$ be a $K$-basis for $L$. We claim the set of $\{\alpha_i\beta_j\}$s is an $F$-basis for $L$, and this isn't too hard to check: for any $\gamma \in L$, we can write it as
\[ \gamma = \sum_j \eta_j \beta_j \]
for $\eta_j \in K$, and then we can write each $\eta_j$ out as
\[ \eta_j = \sum_i c_{ij}\alpha_i \]
for $c_{ij} \in F$. So now we just have
\[ \gamma = \sum_{i,j} c_{ij}\alpha_i\beta_j, \]
as desired! Showing that this is unique is not too hard: basically, each step was unique, so our entire process is unique.


This has a nice corollary which we'll illustrate now: 


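Example 201


Is $\delta = \sqrt{2}$ in the field $\mathbb{Q}[\omega]$, where $\omega = \sqrt[5]{7}$?


We know that $\omega$ is a root of $x^5 - 7$, which is irreducible over $\mathbb{Q}$, so the degree of $\mathbb{Q}[\omega]$ over $\mathbb{Q}$ is 5. If $\delta \in \mathbb{Q}[\omega]$, then we'd have $\mathbb{Q} \subset \mathbb{Q}[\delta] \subset \mathbb{Q}[\omega]$: this can't happen because $[\mathbb{Q}[\delta] : \mathbb{Q}] = 2$ doesn't divide 5. This kind of argument doesn't always work, but it's pretty efficient when it does!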
Recall that if we have fields $F \subset K$, then $K$ is called a field extension of $F$. This is useful when we think of the degree of the field extension $[K : F]$, which is the dimension of $K$ as an $F$-vector space. Specifically, if we have $F \subset K \subset L$, then
\[ [L : F] = [L : K][K : F]. \]
One way to look at field extensions is to take some nonzero element $\alpha \in K$ and consider the map
\[ \varphi : F[x] \to K \]
which sends $x \mapsto \alpha$. If the kernel of $\varphi$ is not just zero, then $\alpha$ is called algebraic (otherwise it is transcendental), and since we have a principal ideal domain, $\ker \varphi = (f)$ for some polynomial $f$. $(f)$ is a prime ideal generated by a monic irreducible polynomial $f(x)$, so $(f)$ is a maximal ideal. Thus, by the first isomorphism theorem,
\[ F[x]/(f) \cong F[\alpha] \subset K \]
forms a field, and $F[\alpha]$ has an $F$-basis of $(1, \alpha, \ldots, \alpha^{n-1})$, where the degree of $f$ is $n$. This means that the degree of $\alpha$ over $F$ is (by definition) the degree of the minimal polynomial $f$. So we have a chain
\[ F \subset F[\alpha] \subset K, \]
where the degree $[K : F]$ is divisible by the degree $[F[\alpha] : F] = n$.


Corollary 202 


For any element $\alpha \in K$, the degree of $\alpha$ over $F$ divides the degree of $K$ over $F$.


Specifically, if $[K : F] = p$ is prime, $[F[\alpha] : F]$ is either 1 or $p$. If it is 1, that's the same as saying that $\alpha$ is actually an element of $F$. Otherwise, we actually have $K = F[\alpha]$: $\alpha$ gives us the whole field extension.


Example 203 


Take $F = \mathbb{Q}$. $x^5 - 2$ is an irreducible polynomial by Eisenstein, so if $\alpha$ is some complex root of $x^5 - 2$, then we must have $[F[\alpha] : F] = 5$. In addition, any $\beta \in F[\alpha]$ (which isn't an element of $F$) is the root of an irreducible polynomial of degree 5.


The "irreducible" part of this isn't immediately obvious! For example, if we took $\beta = 1 + \alpha^2$, we can write $1, \beta, \beta^2, \beta^3, \beta^4, \beta^5$ in terms of the basis elements $(1, \alpha, \alpha^2, \alpha^3, \alpha^4)$. These six expressions are linearly dependent, which gives a polynomial of degree at most 5 with $\beta$ as a root, but it's not immediately clear that the resulting polynomial in $\beta$ is irreducible.




Question 204. What if $f(x)$ is a polynomial of degree 4 which is irreducible over $F = \mathbb{Q}$?


We'll start the argument the same way: take $\alpha$ to be a root of $f$. The field extension $K = F[\alpha]$ has degree $[F[\alpha] : F] = 4$, which is not a prime. Is there a field $L$ such that $F \subset L \subset K$ where $[K : L] = [L : F] = 2$? The answer actually
depends on f, and it’s a hard question, so we won't answer it here (because we need Galois theory). We'll just do an 


example where such a field does actually exist: 


Example 205 
Let $\zeta = e^{2\pi i/5}$ be a fifth root of unity. Over $\mathbb{Q}$, its irreducible polynomial is
\[ \frac{x^5 - 1}{x - 1} = x^4 + x^3 + x^2 + x + 1. \]
This has roots $\zeta, \zeta^2, \zeta^3, \zeta^4$. Now $\zeta$ is not a real number, but $\zeta + \zeta^4$ is! Letting this be $\alpha$, $F[\alpha]$ is smaller than the whole field $F[\zeta]$ (because it has only real numbers), and also $\alpha$ is not rational. Thus $F < F[\alpha] < F[\zeta]$, and therefore we must have $[F[\zeta] : F[\alpha]] = [F[\alpha] : F] = 2$.

How do we find the irreducible polynomial for $\alpha$ over $\mathbb{Q}$? The most straightforward way is to write down the powers and write down a linear relation between them. Turns out we won't have to go far:
\[ \alpha^0 = 1, \]
\[ \alpha^1 = \zeta + \zeta^4, \]
\[ \alpha^2 = \zeta^2 + \zeta^3 + 2, \]
and now note that $\zeta + \zeta^2 + \zeta^3 + \zeta^4 = -1$, so
\[ \alpha + \alpha^2 = -1 + 2 = 1. \]
That means $\alpha$ is a root of the polynomial $x^2 + x - 1$, and $\alpha = \frac{-1 + \sqrt{5}}{2}$ (it's the positive root because $\zeta$ and $\zeta^4$ have positive real parts). Specifically, our field $F[\alpha] = F[\sqrt{5}]$.
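The relation for $\alpha = \zeta + \zeta^4$ is easy to check numerically. This short Python sketch uses floating-point complex arithmetic, so it is a plausibility check up to rounding error (the tolerance `1e-12` is an arbitrary choice), not a proof.

```python
import cmath
import math

zeta = cmath.exp(2j * math.pi / 5)       # a primitive fifth root of unity
alpha = zeta + zeta**4                   # equals 2*cos(2*pi/5)

print(abs(alpha.imag) < 1e-12)                              # alpha is real
print(abs(alpha**2 + alpha - 1) < 1e-12)                    # root of x^2 + x - 1
print(abs(alpha.real - (-1 + math.sqrt(5)) / 2) < 1e-12)    # the positive root
```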


With that, we'll move on to a related topic: constructions with a ruler and compass. Here's some rules: 


• Our ruler is actually a straightedge (it doesn't have inches or even centimeters).
• We start off with two points given in the plane that are "constructed."
• Given two points $p_0, p_1$ that are constructed, we can draw a line $L(p_0, p_1)$ through them.
• We can also draw a circle $C(p_0, p_1)$, which is a circle with center $p_0$ that passes through $p_1$.
• Intersections are also constructed points.


This is all we're allowed to do! What can we construct this way? First of all, there are some elementary constructions 


that we might have learned in high school. 


Example 206 


If we construct some line L and some point p, can we construct the perpendicular line to L passing through p? 


We don't want to pick arbitrary points on our line — that’s not so elegant. But we know that if L is constructed, 


there are two points on it: one is not directly below p, so we can just draw the circle passing through it with center 
p. Now we can just draw circles through those points (with the other point as center) and get another point on the
perpendicular bisector: connect them and we're done! 


Similarly, we can construct a parallel line by doing two perpendiculars. 


Example 207 


We have two points $\ell$ apart: how can we construct a point on a line $\ell$ away from a given point?


If we have $P_1$ and $P_2$ off the line and $A$ on the line, construct parallelogram $AP_1P_2B$ (by using parallels): now $AB = \ell$, and draw a circle through point $B$ with center $A$.
To make more progress, we'll introduce Cartesian coordinates! We can assume our starting points are (0,0) and 


(1,0) — we can draw our x and y-axes with this by using the perpendicular trick. 


Definition 208 


A real number $z$ is constructible if we can construct points $p_0, p_1$ that are a distance $|z|$ apart in our coordinates.


Lemma 209


$p_0 = (x_0, y_0)$ is constructible if and only if $x_0$ and $y_0$ are constructible real numbers.


Think of this as constructing $(x_0, 0)$ and then $(x_0, y_0)$.


Lemma 210 
Suppose $p_0, p_1$ are points with coordinates in a field $K$. Then $L(p_0, p_1)$ and $C(p_0, p_1)$ have equations with coefficients in $K$:
\[ ax + by + c = 0, \quad (x - x_0)^2 + (y - y_0)^2 = r^2, \]
where $a, b, c, x_0, y_0, r^2 \in K$.


Proof. The equation of a line through $(x_0, y_0)$ and $(x_1, y_1)$ is (by point-slope)
\[ (x_1 - x_0)(y - y_0) = (y_1 - y_0)(x - x_0), \]
which has coefficients in $K$ because a field is closed under the arithmetic operations. Similarly, the circle has
\[ (x - x_0)^2 + (y - y_0)^2 = r^2 = (x_1 - x_0)^2 + (y_1 - y_0)^2. \]


So we draw a bunch of circles and lines: the equation’s coefficients are still in the field that we're in each time we 


do such a construction. But what do we know about the new coordinates of the intersection points? 


Lemma 211 


If lines and circles have equations with coefficients in a field $K$, the intersection points have coordinates in $K$ or a quadratic field extension of $K$.


Proof. Solve the equations! If solutions don't exist, we don't care, since that just means we don't add constructible points.

Two linear equations have a solution in the field by linear algebra (if one exists). If we have a line and a circle, then

$$ax + by + c = 0, \qquad (x - x_0)^2 + (y - y_0)^2 = r^2$$

can be solved by substitution: write $y$ as a linear expression in $x$ (or $x$ in terms of $y$ if $b = 0$), which gives a quadratic equation for $x$ (and $y$ is linear in $x$). Any solution, if it exists, is in $K$ or a quadratic field extension of $K$.
Finally, we might need to intersect two circles. Usually this looks like a problem, because we should have a fourth-degree equation when we have two second-degree equations. But that doesn't happen here: if

$$(x - x_0)^2 + (y - y_0)^2 = r_0^2, \qquad (x - x_1)^2 + (y - y_1)^2 = r_1^2,$$

we can just subtract the two to get a linear equation. Now this can be solved in the same way as the line-and-circle case, so we again have a quadratic field extension.
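To see the "subtract the two circles" step in action, here is a minimal Python sketch. The function name `intersect_circles` and the example circles are our own choices, not something from the lecture. It keeps everything rational until a single square root is needed, which is exactly the quadratic extension in the lemma.

```python
from fractions import Fraction
from math import sqrt

def intersect_circles(c0, r0_sq, c1, r1_sq):
    """Intersect two circles given by rational centers and squared radii.

    Returns (s_sq, points): s_sq is the rational number whose square root
    is the only irrational quantity needed; points are float coordinates.
    Assumes the two centers are distinct and the inputs are Fractions.
    """
    (x0, y0), (x1, y1) = c0, c1
    # Subtracting the two circle equations leaves the line a*x + b*y = c.
    a, b = 2 * (x1 - x0), 2 * (y1 - y0)
    c = (x1**2 + y1**2 - r1_sq) - (x0**2 + y0**2 - r0_sq)
    n = a * a + b * b
    # Foot of the perpendicular from the first center to that line.
    t = (c - a * x0 - b * y0) / n
    px, py = x0 + t * a, y0 + t * b
    # Move along the line by +-s*(-b, a); s^2 is still a rational number.
    s_sq = (r0_sq - t * t * n) / n
    if s_sq < 0:
        return s_sq, []          # the circles do not meet
    s = sqrt(s_sq)
    return s_sq, [
        (float(px) - s * float(b), float(py) + s * float(a)),
        (float(px) + s * float(b), float(py) - s * float(a)),
    ]

# Unit circles centered at (0,0) and (1,0): only sqrt(3) has to be adjoined.
s_sq, pts = intersect_circles(
    (Fraction(0), Fraction(0)), Fraction(1),
    (Fraction(1), Fraction(0)), Fraction(1),
)
```

Here `s_sq` comes out as the rational number 3/16, so the coordinates of the intersection points lie in the quadratic extension obtained by adjoining $\sqrt{3}$.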


So as a consequence, starting with the rational numbers and trying to make more constructible objects only gives 
us quadratic field extensions of Q. (Note that if any of our equations give no solutions or complex solutions, this 
just means, geometrically, that our objects don’t intersect, so we don’t care about them.) We can state that as a 


formal result: 


Theorem 212 


A real number $\alpha$ is a constructible real number if and only if there exists a chain of fields

$$\mathbb{Q} = K_0 \subset K_1 \subset \cdots \subset K_k \subset \mathbb{R},$$

where each degree $[K_{j+1} : K_j] = 2$ and $\alpha \in K_k$.


Question 213. Given an angle $\theta$, is it possible to construct $\alpha = \theta/3$?


Given any angle, we can move it over to the coordinate axes, so that we have one of the rays on the x-axis. Our goal is then to construct a ray at angle $\alpha$. Clearly it's possible sometimes (for example, we can trisect a right angle, because we can construct the angle $30^\circ$), but it turns out to not be possible in general:


Proposition 214 


It is not possible to trisect a 60 degree angle. 


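Proof. Note that if we could construct an angle $\theta = 20^\circ$, we would also be able to construct $\alpha = 2\cos\theta = e^{2\pi i/18} + e^{-2\pi i/18}$. By the triple-angle identity or other methods, $\alpha$ is a root of the polynomial $x^3 - 3x - 1$.

This is an irreducible polynomial over $\mathbb{Q}$: a reducible cubic would have a rational root, which by the rational root theorem would have to be $\pm 1$, and neither is a root. But the degree $[K_k : K_0]$ must be a power of 2, while the degree $[\mathbb{Q}[\alpha] : \mathbb{Q}] = 3$ would have to divide it, which is a contradiction.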
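The two facts used in this proof are easy to sanity-check numerically. The following Python sketch (our own illustration, with made-up variable names) tests the only rational root candidates of $x^3 - 3x - 1$ and confirms that $2\cos 20^\circ$ is a root up to floating-point error.

```python
from fractions import Fraction
from math import cos, radians

def f(x):
    """The cubic x^3 - 3x - 1 from the trisection argument."""
    return x**3 - 3 * x - 1

# Rational root theorem: a rational root p/q in lowest terms needs p | 1 and q | 1.
candidates = [Fraction(1), Fraction(-1)]
rational_roots = [r for r in candidates if f(r) == 0]

# alpha = 2 cos(20 degrees) should satisfy f(alpha) = 0 (numerically).
alpha = 2 * cos(radians(20))
residual = f(alpha)
```

Since `rational_roots` is empty, the cubic has no linear factor over $\mathbb{Q}$, which for a cubic is the same as being irreducible.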


26 April 19, 2019 


Today, we'll start by talking about adjoining roots to a field $F$. Let's say we have an irreducible polynomial $f(x) \in F[x]$, and let's say we want to ask for a field extension $K$ such that $K = F[\alpha]$, where $\alpha \in K$ is a root of $f$. If our underlying field $F$ is a subfield of $\mathbb{C}$, we can just pick a root $\alpha \in \mathbb{C}$ by the fundamental theorem of algebra. But there are fields that aren't subfields of $\mathbb{C}$: for example, what if $F = \mathbb{F}_p$ or $\mathbb{C}(t)$ (the fraction field of the polynomial ring $\mathbb{C}[t]$)?

Well, if we have an $\alpha$ in our extension field $K$, we can construct a map $\pi : F[x] \to K$, where $F$ is sent to itself and $x$ is sent to $\alpha$. Then if $\alpha$ is a root of $f$, the kernel of $\pi$ will be the principal ideal $(f)$, and by the first isomorphism theorem, we have our field without needing to define things in terms of $\alpha$ itself:


Proposition 215 


Let $\alpha$ be a root of an irreducible polynomial $f$. Then $F[x]/(f) \cong F[\alpha]$.


In particular, $f$ being irreducible implies that $(f)$ is a maximal ideal, so $K = F[\alpha]$ is a field!


Fact 216 


The residue of $x$ in $K = F[x]/(f)$, $\pi(x) = \alpha$, is a root of $f$.


Proof. Consider the maps

$$F \to F[x] \xrightarrow{\ \pi\ } F[x]/(f) = K.$$

The composition here is injective: if $a, b \in F$ both map to the same $k \in K$, then $a - b$ must be a multiple of $f$, which means $a = b$. Our goal is then to check that $\alpha = \pi(x)$ is a root of $f$. Note that $\pi(f(x)) = 0$ (since $\pi$ mods out by $f$). On the other hand, if $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$, where the coefficients are in $F$, we have

$$0 = \pi(f) = \pi(x)^n + \pi(a_{n-1})\pi(x)^{n-1} + \cdots + \pi(a_0).$$

$\pi$ of any coefficient is just that coefficient itself (since we identify $F$ with its image in $K$), and $\pi(x) = \alpha$. Plugging everything in, we find that

$$0 = \alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_0,$$

and thus $\alpha$ is a root, as desired.


It's important to note that in $K = F[\alpha]$, elements are polynomial expressions in $\alpha$, and it has an $F$-basis $(1, \alpha, \ldots, \alpha^{n-1})$. Then the relation $f(\alpha) = 0$ is the “only one” that is relevant.


Example 217 


Let $F = \mathbb{Q}$, and let's take $\zeta = e^{2\pi i/7}$, a seventh root of unity.


This is a root of the polynomial

$$f(x) = \frac{x^7 - 1}{x - 1} = x^6 + x^5 + \cdots + x + 1.$$

Thus, $F[\zeta]$ has basis $(1, \zeta, \ldots, \zeta^5)$, and we have the relation

$$\zeta^6 + \zeta^5 + \cdots + \zeta + 1 = 0$$

as the only linear relation between these powers of $\zeta$. Note that the roots of $f(x) = 0$ are $\zeta, \zeta^2, \ldots, \zeta^6$, so they're all roots of this irreducible polynomial. Plugging in $\gamma = \zeta^2$, we also have

$$\gamma^6 + \gamma^5 + \cdots + \gamma + 1 = 0,$$

but this is the same relation as the original one if we plug things back in ($\zeta^5 + \zeta^3 + \zeta + \zeta^6 + \zeta^4 + \zeta^2 + 1 = 0$). So from the point of view of the rational numbers, there's really no difference between $\gamma$ and $\zeta$.


Fact 218


Both of the following follow from the logic above: 


* If $\alpha$ is a root of an irreducible polynomial $f$ in $K = F[\alpha]$, and $\alpha'$ is a root of that same $f$ in a field $K' = F[\alpha']$, then there exists a unique $F$-isomorphism $\varphi : K \to K'$ (which means that $\varphi$ is the identity on $F$), sending $\alpha$ to $\alpha'$.


* On the other hand, if we have an $F$-isomorphism $\varphi : K \to K'$, and $\alpha \in K$ is a root of an irreducible $f(x) \in F[x]$, then $\varphi(\alpha) = \alpha'$ is also a root of $f$ (by applying $\varphi$ to the polynomial relation for $\alpha$).


Proposition 219 
Take any (not necessarily irreducible) polynomial $g(x) \in F[x]$. Then there exists a field extension $K$ in which $g$ factors into linear factors: that is, $g$ splits completely.


Proof. Induct on the degree of $g$. Choose an irreducible factor $f \mid g$, and adjoin a root $\alpha_1$ of $f$ to $F$ by the method above. Now in $F_1 = F[\alpha_1]$, $x - \alpha_1$ is a factor of $f$, which is a factor of $g$, so now we have $g = (x - \alpha_1)g_1$. The degree of $g_1$ is smaller than that of $g$, and we're done by induction.

Concretely, we'll get some chain

$$F \subset F[\alpha_1] \subset F[\alpha_1, \alpha_2] \subset \cdots \subset F[\alpha_1, \ldots, \alpha_n] = K.$$

The degree of the first extension is at most $n$, and the degree of the second one is at most $n - 1$, and so on. The last one is free: if we have $n - 1$ of the roots, we get the last one with a degree 1 extension, which means we stay in the field itself. This leads us to the following result:


Proposition 220 
For any splitting field $K$ of a degree $n$ polynomial over $F$, $[K : F] \le n!$.


This is not a very good bound, but it’s the best we can do. 


Example 221 


Let's take $F = \mathbb{Q}$, $f(x) = x^3 - 2$. We can use our complex roots $\alpha_1 = \sqrt[3]{2}$, $\alpha_2 = \omega\alpha_1$, $\alpha_3 = \omega^2\alpha_1$, where $\omega = e^{2\pi i/3}$ is a cube root of unity.


Note that $[F[\alpha_1] : F] = 3$, and then the polynomial we're left with is

$$x^3 - 2 = (x - \alpha_1)q(x),$$

where $q(x)$ is a quadratic with complex roots $\alpha_2, \alpha_3$. Now $[F[\alpha_1, \alpha_2] : F[\alpha_1]] \le 2$: in this case, $\alpha_1$ is a real number, and $\omega$ is not real, so adjoining $\alpha_2$ is not free. So here we have

$$[K : F] = 3 \cdot 2 \cdot 1 = 6.$$

More generally, an irreducible cubic polynomial with rational coefficients always has at least one real root. The only case in which it might have $[K : F] = 3$ is if all three roots are real, and even then it still depends on the specific polynomial.


Example 222 
Let $F = \mathbb{F}_2 = \{0, 1\}$ be the finite field on 2 elements; recall that the irreducible cubics are $f = x^3 + x + 1$ and $f' = x^3 + x^2 + 1$.


Let's adjoin a root $\alpha$ of $f = x^3 + x + 1$: we have $[F[\alpha] : F] = 3$, so $K = F[\alpha]$ is a vector space of dimension 3 over $F$, and thus

$$|K| = |F[\alpha]| = 2^3 = 8.$$

The elements are then

$$\{0, 1, \alpha, 1 + \alpha, \alpha^2, 1 + \alpha^2, \alpha + \alpha^2, 1 + \alpha + \alpha^2\}.$$

Let $\beta$ be any of the elements that is not 0 or 1. Note that adjoining $\beta$ to $F$ yields some subfield of $K$. But the degrees satisfy

$$[K : F] = 3 = [K : F[\beta]]\,[F[\beta] : F],$$

and since $\beta \notin F$, we must have $K = F[\beta]$. Thus $\beta$ must be a root of either $x^3 + x + 1$ or $x^3 + x^2 + 1$.

Well, we have six elements that aren't 0 and 1, and there are two irreducible cubic polynomials: this means that $f, f'$ both split completely over $K$, and each of them has three of these elements as roots.

How do we find which ones belong to $f = x^3 + x + 1$? Note that $\alpha$, a root of $f$, satisfies $\alpha^3 = \alpha + 1$. Since we have $(a + b)^2 = a^2 + b^2$ in characteristic 2, squaring both sides of our relation, we get

$$\alpha^6 = \alpha^2 + 1.$$

This means that $\beta = \alpha^2$ satisfies $\beta^3 = \beta + 1$, and squaring once more gives the same for $\alpha^4 = \alpha + \alpha^2$, which means that $\alpha, \alpha^2, \alpha + \alpha^2$ must be the roots of $f$.
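This example is small enough to check by brute force. The sketch below is our own illustration: it encodes an element $c_0 + c_1\alpha + c_2\alpha^2$ of $\mathbb{F}_8$ as the 3-bit integer with bits $c_0, c_1, c_2$ (an encoding we chose, not one from the lecture), multiplies using the relation $\alpha^3 = \alpha + 1$, and sorts the eight elements by which cubic they satisfy.

```python
MOD = 0b1011  # x^3 + x + 1 over F_2 (bit i is the coefficient of x^i)

def mul(a, b):
    """Multiply two elements of F_2[x]/(x^3 + x + 1), encoded as 3-bit ints."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1                  # multiply a by alpha
        if a & 0b1000:           # degree reached 3: use alpha^3 = alpha + 1
            a ^= MOD
    return result

def evaluate(poly_bits, x):
    """Evaluate a polynomial over F_2 (bit i = coefficient of x^i) at x in F_8."""
    total, power = 0, 1
    for i in range(poly_bits.bit_length()):
        if (poly_bits >> i) & 1:
            total ^= power
        power = mul(power, x)
    return total

f_bits = 0b1011      # x^3 + x + 1
other_bits = 0b1101  # x^3 + x^2 + 1
roots_f = [x for x in range(8) if evaluate(f_bits, x) == 0]
roots_other = [x for x in range(8) if evaluate(other_bits, x) == 0]
```

In this encoding $\alpha = 2$, $\alpha^2 = 4$ and $\alpha + \alpha^2 = 6$, so `roots_f` being `[2, 4, 6]` matches the conclusion above, and the remaining three elements `[3, 5, 7]` are the roots of the other cubic.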


Example 223 


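Let's go back to our field $L = F[\zeta]$, where $\zeta$ is a seventh root of unity.


Taking $\alpha = \zeta + \zeta^6$, a real number, we have the chain

$$F \subset F[\alpha] \subset L.$$

$[L : F] = 6$, and $[L : F[\alpha]] > 1$ because $F[\alpha]$ contains only real numbers. Furthermore,

$$(x - \zeta)(x - \zeta^6) = x^2 - \alpha x + 1,$$

so $[L : F[\alpha]] = 2$, and $[F[\alpha] : F] = 3$. To find the minimal polynomial for $\alpha$, we can compute powers of $\alpha$ until we get a relation:

$$\alpha^0 = 1, \quad \alpha^1 = \zeta + \zeta^6, \quad \alpha^2 = \zeta^2 + 2 + \zeta^5, \quad \alpha^3 = \zeta^3 + 3\zeta + 3\zeta^6 + \zeta^4,$$

and now we have the relation

$$\alpha^3 + \alpha^2 - 2\alpha - 1 = 0,$$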


so $\alpha$ is a root of $x^3 + x^2 - 2x - 1$. To find the other roots from here, note that we can let $\gamma = \zeta^2$: then this has all the same properties as $\zeta$ from the perspective of the rationals, so $\alpha_2 = \gamma + \gamma^6 = \zeta^2 + \zeta^5$ is a root as well, and so is $\alpha_3 = \zeta^3 + \zeta^4$. If we don't believe this, we can always expand out

$$(x - \alpha_1)(x - \alpha_2)(x - \alpha_3) = x^3 - (\alpha_1 + \alpha_2 + \alpha_3)x^2 + (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)x - \alpha_1\alpha_2\alpha_3.$$

The $x^2$-coefficient is $-(-1) = 1$. The $x$-coefficient can be done trickily: $\alpha_1\alpha_2$ is a sum of four powers of $\zeta$, and so are the others. This means we have 12 terms, none of which is $\zeta^0$, and so we'll have a $-2$ there (because the sum of the seventh roots of unity except $\zeta^0$ is $-1$). Finally, $\alpha_1\alpha_2\alpha_3$ is a sum of eight powers, and it can include $\zeta^0$. So it must be 2 copies of $\zeta^0$ and 1 copy of all the others, so that yields $2 - 1 = 1$. We've now sort of verified that this works!
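The relation for $\alpha = \zeta + \zeta^6$ can also be checked with exact integer arithmetic. In the sketch below (our own illustration), an element $\sum c_i\zeta^i$ is stored as a list of 7 integer coefficients, multiplication is cyclic convolution because $\zeta^7 = 1$, and an element is zero exactly when all 7 coefficients agree, since $1 + \zeta + \cdots + \zeta^6 = 0$ is the only relation.

```python
def mul(a, b):
    """Multiply two elements of Z[zeta], zeta^7 = 1, given as coefficient lists."""
    out = [0] * 7
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % 7] += ai * bj
    return out

def is_zero(a):
    """An element vanishes iff it is a multiple of 1 + zeta + ... + zeta^6."""
    return len(set(a)) == 1

one = [1, 0, 0, 0, 0, 0, 0]
alpha = [0, 1, 0, 0, 0, 0, 1]        # zeta + zeta^6
alpha2 = mul(alpha, alpha)           # zeta^2 + 2 + zeta^5
alpha3 = mul(alpha2, alpha)          # zeta^3 + 3 zeta + 3 zeta^6 + zeta^4

# Coefficient list of alpha^3 + alpha^2 - 2 alpha - 1.
relation = [x3 + x2 - 2 * x1 - x0
            for x3, x2, x1, x0 in zip(alpha3, alpha2, alpha, one)]
relation_holds = is_zero(relation)
```

Every entry of `relation` equals 1, that is, the expression is $1 + \zeta + \cdots + \zeta^6 = 0$, which is the relation we found by hand.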


27 April 22, 2019 


We're going to discuss finite fields today. This is a fun topic, because it’s actually pretty hard to satisfy all of the 


properties needed. Let’s start with two preliminary ideas first. 


Definition 224 


Let $f(x) = a_nx^n + \cdots + a_0$ be any polynomial in $F[x]$, where $F$ is a field. Then the derivative of $f$ is

$$f'(x) = na_nx^{n-1} + (n-1)a_{n-1}x^{n-2} + \cdots + a_1.$$

Here, we must interpret $n = 1 + \cdots + 1$ as an element of $F$.


This satisfies the familiar calculus rules:
* $(f + g)' = f' + g'$,
* $(fg)' = f'g + g'f$,
* $(cf)' = cf'$.


Proposition 225 


An element $\alpha \in F$ is a double root of a polynomial $f(x)$, which means that $(x - \alpha)^2$ divides $f$, if and only if $\alpha$ is a root of both $f$ and $f'$.


Proof. Clearly $\alpha$ needs to be a root if it is a double root. This means that $(x - \alpha)$ is a factor of $f$, so we can write

$$f(x) = (x - \alpha)q(x).$$

Now by the product rule,

$$f'(x) = q(x) + (x - \alpha)q'(x).$$

So $\alpha$ is a root of $f'$ exactly when $0 = f'(\alpha) = q(\alpha) + (\alpha - \alpha)q'(\alpha) = q(\alpha)$. This happens exactly when $(x - \alpha)$ divides $q$, which means we can write

$$f = (x - \alpha)^2r(x),$$

and thus $\alpha$ is a double root.


Let's do an example: suppose $f(x) = x^n - 1$. The derivative is $f'(x) = nx^{n-1}$, and the only root of $f'$ seems to be 0 (since we have a field), which is not a root of $f$. This means $f$ has no double roots unless $f'$ is identically zero, which happens if $n = 0$ in our field! In particular, this means that $f = x^p - 1$ has derivative zero if $F = \mathbb{F}_p$. In fact,

$$x^p - 1 = (x - 1)^p$$

by the binomial theorem mod $p$, since $\binom{p}{k}$ is always a multiple of $p$ for $0 < k < p$, and $p$ is always odd except for $p = 2$ (which also works).
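The identity $x^p - 1 = (x - 1)^p$ over $\mathbb{F}_p$ is quick to confirm for small primes. This Python sketch (an illustration with our own helper names) expands $(x - 1)^n$ with binomial coefficients, reduces mod $n$, and compares with $x^n - 1$; it also shows that the identity fails for a composite exponent.

```python
from math import comb

def x_minus_one_power(n):
    """Coefficients of (x - 1)^n reduced mod n, lowest degree first."""
    return [(comb(n, k) * (-1) ** (n - k)) % n for k in range(n + 1)]

def x_power_minus_one(n):
    """Coefficients of x^n - 1 reduced mod n, lowest degree first."""
    return [(-1) % n] + [0] * (n - 1) + [1]

primes_ok = all(x_minus_one_power(p) == x_power_minus_one(p)
                for p in (2, 3, 5, 7, 11, 13))
composite_ok = x_minus_one_power(4) == x_power_minus_one(4)
```

The case $p = 2$ works because $-1 = 1$ there, which is the "also works" remark above; for $n = 4$ the middle coefficient $\binom{4}{2} = 6$ is not a multiple of 4, so `composite_ok` is false.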


One other preparatory idea: 


Theorem 226 


Let F be a field, and let H be a finite subgroup of the multiplicative group F*. Then H is a cyclic group. 


For example, the nonzero integers mod $p$ always form a cyclic group under multiplication! Letting $p = 7$, $F = \mathbb{F}_7$, $H = F^\times$ gives a cyclic group of order 6: 2 is not a generator, since $2^3 = 1$, but the powers of 3 are $\{3, 2, 6, 4, 5, 1\}$, so 3 generates the cyclic group $\mathbb{F}_7^\times$. On the other hand, the powers of 2 mod 13 are $\{1, 2, 4, 8, 3, 6, 12, 11, 9, 5, 10, 7\}$, so 2 is a generator in this case. Generators are called primitive roots.

It's not really well understood which elements are primitive roots, but this theorem tells us that there exists one 


(because we have a cyclic group). 
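Since the theorem only guarantees that a primitive root exists, a brute-force search is the simplest way to find one for a small prime. The helpers `powers` and `primitive_roots` below are our own names for this sketch.

```python
def powers(g, p):
    """The list g^1, g^2, ..., g^(p-1) reduced mod p."""
    out, x = [], 1
    for _ in range(p - 1):
        x = x * g % p
        out.append(x)
    return out

def primitive_roots(p):
    """All g in 1..p-1 whose powers hit every nonzero residue mod p."""
    return [g for g in range(1, p) if len(set(powers(g, p))) == p - 1]

powers_of_3_mod_7 = powers(3, 7)
roots_mod_7 = primitive_roots(7)
roots_mod_13 = primitive_roots(13)
```

For $p = 7$ this reproduces the list of powers of 3 from the example and finds the primitive roots 3 and 5; for $p = 13$ it confirms that 2 is one of the primitive roots.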


Proof. By the basis theorem for abelian groups, every finitely generated abelian group is a product of cyclic groups. A finite group is obviously finitely generated (by its elements), and recall that the proof tells us when we diagonalize our integer matrix, we can make the entries divide each other. So that means

$$H = C_{d_1} \times C_{d_2} \times \cdots \times C_{d_k},$$

where $d_1 \mid d_2 \mid \cdots \mid d_k$. Let's count how many elements of order dividing $d_1$ there are in this group. There are $d_1$ ways to pick a representative from each of the $C_{d_i}$s, so there are $d_1^k$ elements of order dividing $d_1$: thus, they are all roots of $x^{d_1} - 1$.


Lemma 227 


Over any field $F$, a polynomial of degree $d$ has at most $d$ roots in $F$.


Proof of lemma. Take any root $\alpha_1$: then $f(x) = (x - \alpha_1)g(x)$, and $g$ has degree $d - 1$. Any other root of $f$ must be a root of $g$, and by induction, $g$ has at most $d - 1$ roots, so we're done.


So now $d_1^k \le d_1$, and therefore $k = 1$.


We're ready to start talking about finite fields now! Let $K$ be a finite field with $|K| = q$. We always have a unique homomorphism of the form

$$\varphi : \mathbb{Z} \to K.$$

$\ker\varphi$ is a principal ideal, since $\mathbb{Z}$ is a principal ideal domain, and it must be generated by a prime element (the image is a subring of a field, hence an integral domain, and the kernel is nonzero because $K$ is finite). This means $\ker\varphi = p\mathbb{Z}$ for some prime integer $p$. Meanwhile, the image is isomorphic to $F = \mathbb{F}_p$, so $K$ contains a copy of $F$, and both of these are fields. Thus $[K : F]$ is some integer $e$, so the order of $K$ is $|F|^e = p^e$, and that means $q = p^e$ must be a prime power!

It doesn't seem clear that $K = \mathbb{F}_q$ either exists or is unique at this point, though. Well, $K^\times$ is a cyclic group with order $q - 1$ by Theorem 226. Thus, the elements of $K^\times$ are roots of $x^{q-1} - 1$, and thus $x^{q-1} - 1$ splits completely!


It's customary to multiply the factor $(x - 0)$ back in to get a more symmetric result:


Proposition 228 


All elements of $K = \mathbb{F}_q$ are roots of the polynomial $x^q - x$, which splits completely in $K$.


Theorem 229


Let $p$ be a prime, and let $e \ge 1$. Then there exists a field $K$ with $|K| = p^e = q$, and all such fields are isomorphic.


Proof. We show existence first. Start with the field $F = \mathbb{F}_p$: there exists a field extension $L$ of $F$ where the polynomial $x^q - x$ splits completely by Proposition 219. Let $K$ be the set of roots of $x^q - x$ in $L$: we first need to show that $x^q - x$ doesn't have a double root. Taking the derivative,

$$f(x) = x^q - x \implies f'(x) = qx^{q-1} - 1 = -1,$$

because $q$ is a multiple of $p$. This is never 0, so there are no double roots of $x^q - x$. So we do have $|K| = q$ elements, and now we must just show that if $\alpha, \beta$ are in $K$, so are $\alpha + \beta$, $\alpha\beta$, and $\alpha^{-1}$. We have that

$$(\alpha\beta)^q = \alpha^q\beta^q = \alpha\beta,$$

since $x^q = x$ for all elements of $K$, and thus $\alpha\beta$ is in $K$. Similarly,

$$(\alpha^{-1})^q = (\alpha^q)^{-1} = \alpha^{-1}.$$

Finally, $(\alpha + \beta)^q$ is a bit more difficult: we have to use induction. If $q = p^e = pk$, where $k = p^{e-1}$, then

$$(\alpha + \beta)^q = [(\alpha + \beta)^p]^k.$$

By the binomial theorem, we can expand out $(\alpha + \beta)^p$ to $\alpha^p + \beta^p$, and now this is

$$[\alpha^p + \beta^p]^k,$$

and now we're done by induction, since this is equal to $\alpha^q + \beta^q = \alpha + \beta$, as desired. Similarly, replace $\beta$ with $-\beta$ to get that $\alpha - \beta$ is in our field as well.

Now we need to show that the finite field is unique. Let $K_1, K_2$ be two fields of order $q = p^e$: the multiplicative group $K_1^\times$ is cyclic of order $q - 1$, so we can let $\alpha_1$ be a generator. Now the elements of $K_1$ are 0 and powers of $\alpha_1$, so

$$K_1 = F[\alpha_1],$$

where $\alpha_1$ is a root of an irreducible polynomial $f(x) \in F[x]$. Furthermore, the degree of $f$ is the degree $[K_1 : F] = e$. But $\alpha_1$ is also a root of $x^q - x$, and $f$ is irreducible, so $f(x)$ must divide $x^q - x$.

Now looking in $K_2$, $x^q - x$ must split into linear factors, so $f$ has a root $\alpha_2$ in $K_2$. Thus

$$K_1 = F[\alpha_1] \cong F[x]/(f) \cong F[\alpha_2] \subset K_2.$$

But now the degree of $K_2$ over $F$ is $e$ by definition, and the degree of $F[\alpha_2]$ over $F$ is the degree of $f$, which is also $e$. So $F[\alpha_2] = K_2$, and now $K_1$ and $K_2$ are isomorphic as desired.


Here's the last main fact that we care about: 


Proposition 230 


If $K$ is a finite field with order $q = p^e$ and $K'$ is a finite field with order $q' = p^{e'}$, then $K$ contains a field isomorphic to $K'$ if and only if $e' \mid e$.


This is a slight generalization of the idea that fields of the same order are isomorphic!

Proof. It's clear that if $K$ contains $K'$, then $e'$ has to divide $e$: after all,

$$[K : F] = [K : K'][K' : F].$$

To show the other direction, we can write $e = e'd$. Then $u^{e'} - 1$ divides $u^e - 1$ (where $u$ is a variable), since we can write $v = u^{e'}$, and then $v - 1$ divides $v^d - 1$. Setting $u = p$, we see that $q' - 1$ divides $q - 1$, since $q' = p^{e'}$ and $q = p^e$.

Now we can show that $x^{q'-1} - 1$ divides $x^{q-1} - 1$: this is true because $q' - 1$ divides $q - 1$, and we can apply the same logic as above. Finally, multiplying by $x$, $x^{q'} - x$ divides $x^q - x$, and therefore all roots of $x^{q'} - x$ are in $K$. These roots form a field of order $q'$, so a copy of $K'$ is contained in $K$.


One final question: what are the irreducible factors of $x^q - x$ in $F[x]$, where $q = p^e$? The answer turns out to be all irreducible polynomials whose degree divides $e$.


Example 231 


Take $q = 2^4 = 16$. The factorization of $x^{16} - x$ into irreducibles in $\mathbb{F}_2[x]$ is

$$x^{16} - x = x(x + 1)(x^2 + x + 1)\cdots,$$

where the rest is a product of three irreducible quartics, since every factor must have degree $d$ where $d$ is a divisor of 4.


So this is a way to count the number of irreducible polynomials of a certain degree! The quartics turn out to be

$$x^4 + x + 1, \quad x^4 + x^3 + 1, \quad x^4 + x^3 + x^2 + x + 1.$$


We'll go over this next time! 


28 April 24, 2019 


Let's finish discussing finite fields today. As a review: if we have a finite field $K$ with $|K| = q$, we must have $q = p^e$, where $K \supset F = \mathbb{F}_p$, the field of $p$ elements. Then $[K : F] = e$, and we have a few important properties:
* $K^\times$ is a cyclic group of order $q - 1$.
* The elements of $K$ are the roots of $x^q - x$ (which has no double roots). Specifically, $x^q - x$ splits in $K[x]$.
* For all $q = p^e$, there exists a unique $K$ with $|K| = q$ (up to isomorphism).
* If $q' = p^{e'}$, and we have a finite field $K' = \mathbb{F}_{q'}$, then $K' \subset K$ exactly when $e'$ divides $e$.


Corollary 232 


If we factor $x^q - x$ in $F[x]$, where $q = p^e$, then the irreducible factors are all irreducible polynomials $f(x)$ such that the degree of $f$ divides $e$.


Proof. Say that $f$ is an irreducible polynomial in $F[x]$ with degree $d$. If $f$ divides $x^q - x$, then $f$ has a root $\alpha$ in $K$, and $[F(\alpha) : F] = d$: we can then set up a chain

$$F \subset F(\alpha) \subset K,$$

and now since $[K : F] = e$ and $[F(\alpha) : F] = d$, we must have $d$ divide $e$. On the other hand, if $d \mid e$, let $\alpha$ be a root of $f$ in some extension, so that $F(\alpha)$ is a field of order $q' = p^d$. Then $x^{q'} - x$ divides $x^q - x$, so we indeed have a subfield $K'$ of $K$ of order $q'$. It is isomorphic to $F(\alpha)$, so $f$ has a root in $K$ and therefore divides $x^q - x$.


Let's do some applications of this:


Example 233 


Factor $x^{27} - x$ in $\mathbb{F}_3[x]$.


Since $27 = 3^3$, the irreducible factors must be irreducible polynomials of degree 1 or 3 (those are the divisors of the exponent 3). We know the linear polynomials: $x^{27} - x$ is divisible by $x$, $x + 1$, $x - 1$, and this leaves degree 24: luckily this is divisible by 3.

So the rest is a product of 8 irreducible polynomials of degree 3. A polynomial of the form $x^3 + ax^2 + bx + c$ is irreducible when it has no linear factors, so it must not evaluate to 0 at any of $0, 1, -1$. But that's annoying to calculate, so we won't do it here.


Fact 234 


$x^q - x$ has no repeated roots, so each irreducible factor in $\mathbb{F}_p[x]$ can only occur once.


Example 235 


How many irreducible polynomials are there of degree 5 and 10 over $\mathbb{F}_2$?


Note that $2^5 = 32$, so we want to factor $x^{32} - x$. All factors are either linear or degree 5, and the linear factors are $x$ and $x + 1$. This leaves a degree of 30: therefore, there are 6 irreducible polynomials of degree 5. (Remember that any irreducible polynomial of degree 5 does need to be included in the product, because its roots are roots of $x^{32} - x$.)

Meanwhile, the number of irreducible polynomials of degree 10 over $\mathbb{F}_2$ can be found similarly: $2^{10} = 1024$, and we can have only polynomials of degree 1, 2, 5, 10. Since $x$, $x + 1$, $x^2 + x + 1$ are the low-degree irreducible polynomials, and we have 6 of degree 5, this leaves a degree of

$$1024 - 1 - 1 - 2 - 30 = 990,$$

and thus there are 99 irreducible polynomials of degree 10 over $\mathbb{F}_2$.
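The counting argument is really a recursion: comparing degrees in the factorization of $x^{p^n} - x$ gives $p^n = \sum_{d \mid n} d \cdot N(d)$, where $N(d)$ is the number of monic irreducibles of degree $d$. The function below is our own minimal sketch of that recursion (the name `count_irreducibles` is hypothetical).

```python
def count_irreducibles(p, n):
    """Number of monic irreducible polynomials of degree n over F_p.

    Uses deg(x^(p^n) - x) = p^n = sum over divisors d of n of d * N(d),
    solved for N(n) by subtracting the contribution of proper divisors.
    """
    lower = sum(d * count_irreducibles(p, d)
                for d in range(1, n) if n % d == 0)
    return (p**n - lower) // n

quintics_over_f2 = count_irreducibles(2, 5)
degree_10_over_f2 = count_irreducibles(2, 10)
cubics_over_f3 = count_irreducibles(3, 3)
quartics_over_f2 = count_irreducibles(2, 4)
```

This reproduces the counts from the examples: 6 quintics and 99 polynomials of degree 10 over $\mathbb{F}_2$, 8 cubics over $\mathbb{F}_3$, and 3 quartics over $\mathbb{F}_2$.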


What are the six irreducible polynomials of degree 5 here? They're of the form

$$x^5 + ax^4 + bx^3 + cx^2 + dx + 1$$

(since the constant term can't be 0). Since 1 can't be a root, we must have an odd number of $a, b, c, d$ be equal to 1. That leaves eight candidates. Then we need to make sure there aren't quadratic factors: it turns out that $x^5 + x^4 + 1$ and $x^5 + x + 1$ are reducible (both are divisible by $x^2 + x + 1$), and the other six work.
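The six quintics can also be listed by brute force. In this sketch (our own illustration), a polynomial over $\mathbb{F}_2$ is encoded as an integer whose bit $i$ is the coefficient of $x^i$, and irreducibility is tested by trial division by every polynomial of degree 1 or 2.

```python
def poly_mod(a, b):
    """Remainder of a divided by b in F_2[x] (bit i = coefficient of x^i)."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)   # cancel the leading term of a
    return a

def is_irreducible(f):
    """Trial division by all polynomials of degree 1 .. deg(f) // 2."""
    deg = f.bit_length() - 1
    return all(poly_mod(f, g) != 0 for g in range(2, 1 << (deg // 2 + 1)))

# All monic quintics are the integers from 0b100000 to 0b111111.
quintics = [f for f in range(0b100000, 0b1000000) if is_irreducible(f)]
```

The result has six entries, and the two reducible candidates $x^5 + x^4 + 1$ (`0b110001`) and $x^5 + x + 1$ (`0b100011`) are correctly excluded.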
For the rest of class, we'll assume our fields $F$ have characteristic zero: that means adding 1 to itself never gives us 0 (so we don't have something like $\mathbb{F}_p$). This means that $F$ contains $\mathbb{Z}$ and therefore also $\mathbb{Q}$.


Proposition 236 


Let f(x) be an irreducible polynomial in F[x]. Then it has no multiple root in any field extension. 


Proof. If $f$ has a multiple root $\alpha$ in some extension $K$, then $\alpha$ is a root of both $f$ and $f'$. The degree of $f'$ is one less than the degree of $f$ (and $f' \ne 0$ in characteristic zero), and $f$ is irreducible in $F[x]$, so $f, f'$ have no common divisors in $F[x]$. But $(x - \alpha)$ would need to be a common divisor in $K[x]$, because $f$ has to have a factor of $(x - \alpha)^2$ and thus $f'$ has a factor of $(x - \alpha)$. We use a lemma here to finish:


Lemma 237


If $F \subset K$ and $f, g \in F[x]$, then the greatest common divisor in $F[x]$ and $K[x]$ is the same.


Proof. Let $d = \gcd(f, g)$ in $F[x]$, and $d' = \gcd(f, g)$ in $K[x]$. $F[x]$ is a subring of $K[x]$, so $d$ will divide both $f$ and $g$ in $K[x]$: this means $d$ divides $d'$. On the other hand, if $d = pf + qg$ for $p, q \in F[x]$, this statement is also true in the ring $K[x]$. Now $d'$ divides $f$ and $g$, so $d'$ divides $d$. This means the two gcds are equal, as desired.


So since $f$ and $f'$ have no common divisor in $F[x]$, $(x - \alpha)$ can't be a common divisor in $K[x]$ either, and thus there is no multiple root.


Let’s move on: the next idea is sort of tricky. 


Theorem 238 (Primitive Element Theorem) 
If $F \subset K$ is a field extension, and $[K : F] < \infty$ is finite, we say that $\alpha \in K$ is primitive if $K = F[\alpha]$. If $F$ has characteristic 0, then there exists a primitive element for the field extension.
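A small worked instance may make the statement concrete. This is a sketch with our own choice of example, $K = \mathbb{Q}[\sqrt{2}, \sqrt{3}]$ and $\gamma = \sqrt{2} + \sqrt{3}$, which does not appear in the lecture; it checks directly that both generators are polynomials in $\gamma$.

```latex
\begin{align*}
\gamma   &= \sqrt{2} + \sqrt{3}, \\
\gamma^2 &= 5 + 2\sqrt{6}, \\
\gamma^3 &= (5 + 2\sqrt{6})(\sqrt{2} + \sqrt{3}) = 11\sqrt{2} + 9\sqrt{3}, \\
\sqrt{2} &= \tfrac{1}{2}\left(\gamma^3 - 9\gamma\right), \qquad
\sqrt{3}  = \tfrac{1}{2}\left(11\gamma - \gamma^3\right).
\end{align*}
```

So $\mathbb{Q}[\sqrt{2}, \sqrt{3}] = \mathbb{Q}[\gamma]$, which means $\gamma$ is a primitive element for this extension of degree 4.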


Proof. If $K$ is a finite extension, we can always adjoin a finite basis to $F$: let's say $K = F[\alpha_1, \ldots, \alpha_k]$. We induct on $k$: if $k = 1$, we're done. For the inductive step, since $K \supset F[\alpha_1, \ldots, \alpha_{k-1}] = K'$, we can assume $K'$ has a primitive element $\beta$. So now, writing $\alpha = \alpha_k$, we just need to show that $K = F[\alpha, \beta]$ (generated by two elements) has a primitive element.

Let $\gamma = \beta + c\alpha$ ($c \in F$ is a number to be determined). Our goal is to show that $K = F[\alpha, \beta]$ is $F[\gamma]$ for most choices of $c$ (in fact, all but a finite number). Take $f(x)$ to be an irreducible polynomial in $F[x]$ with root $\alpha$ (of degree $m$), and let $g(x)$ be an irreducible polynomial in $F[x]$ with root $\beta$ (of degree $n$). We can construct a field extension $L$ of $K$ in which $f$ and $g$ split completely: then if the roots of $f$ in $L$ are $\alpha_1, \ldots, \alpha_m$ (where $\alpha = \alpha_1$, reusing the symbols $\alpha_i$), and the roots of $g$ in $L$ are $\beta_1, \ldots, \beta_n$ (where $\beta = \beta_1$), all $\alpha$s here are distinct by Proposition 236, and so are all $\beta$s.

So if $\gamma = \beta + c\alpha$, the field $K' = F[\gamma]$ is some subfield of $K$. Our goal is to show that $K' = K$. Here's the trick: $\alpha$ is a root of $f$, and we can find another polynomial in $K'[x]$ with root $\alpha$ as follows. Since $\beta = \gamma - c\alpha$, the polynomial

$$h(x) = g(\gamma - cx)$$

has coefficients in $K'$ (since $c, \gamma \in K'$), and it has $\alpha$ as a root (since $h(\alpha) = g(\beta) = 0$). So now $\alpha$ is a common root of $f$ and $h$: we claim it's the only one! If $\alpha_i$ (another root of $f$) were also a root of $h(x)$, then we must have

$$h(\alpha_i) = g(\gamma - c\alpha_i) = 0,$$

which is true if and only if $\gamma - c\alpha_i = \beta_j$ for some $j$, which means

$$\beta_1 + c\alpha_1 - c\alpha_i = \beta_j.$$

This means

$$c = \frac{\beta_j - \beta_1}{\alpha_1 - \alpha_i},$$

and remember that we can pick $c$ now: just pick it to not be any of these values for any $i, j$. (A field of characteristic zero has infinitely many choices for $c$.) So now $\alpha_1$ is the only common root of $f$ and $h$, which means that the gcd of $f$ and $h$ is $(x - \alpha_1)$ in $L[x]$, and hence also in $K'[x]$ by Lemma 237. Therefore $\alpha_1 \in K'$, which means that $\beta = \gamma - c\alpha_1 \in K'$ as well, and thus $K = F[\alpha, \beta] \subset K'$.


Picking a suitable $c$, $K$ and $K'$ are the same field, as desired, completing the inductive step.


Fact 239


The conclusion of this theorem can fail for fields of nonzero characteristic! For example, consider $F = \mathbb{F}_p(t)$, and let $f(x) = x^p - t$.


This is irreducible (with a similar argument as Eisenstein's criterion), but $f'(x) = px^{p-1} = 0$. So all roots are multiple roots: in fact, this has just one root in any field extension. So certain field extensions of degree $p^2$ do not have a primitive element (if we adjoin two such $p$th roots).


29 April 26, 2019 


We'll get started on Galois theory today. Note that today we'll deal with characteristic 0, so that all irreducible 
polynomials have distinct roots. 

Let $f$ be an irreducible polynomial in $F[x]$, and let's say $\alpha$ is a root of $f$ in some extension $K$ of $F$. Then $F[\alpha]$ is easy to compute in, because it is just polynomials in $\alpha$ with basis $(1, \ldots, \alpha^{n-1})$, where $n$ is the degree of $f$, and we have the relation $f(\alpha) = 0$. But it's not so clear how to compute with more than one root at a time at the moment.
Definition 240
A splitting field $K$ of a polynomial $f(x) \in F[x]$ is an extension of $F$ with two properties:
* The polynomial $f(x)$ splits completely over $K$ (that is, all of its roots $\alpha_1, \ldots, \alpha_n$ are in $K$).
* We can write $K = F(\alpha_1, \ldots, \alpha_n)$, so all elements of $K$ can be written as a polynomial in the $\alpha_i$s with coefficients in $F$. (This might not be unique, though.)


Now computing in a splitting field requires knowing how the roots are related, and that depends on our polynomial.


Example 241 


Let's say $f(x) = x^2 + bx + c$ is some quadratic polynomial, where $b, c \in F$.


Then the roots $\alpha_1, \alpha_2$ are in some splitting field $K$, and we want to say something more about how they are related.


Definition 242 
An $F$-automorphism of a field extension $K$ is an isomorphism $\sigma : K \to K$ back to itself, such that $\sigma$ is the identity on $F$.


Lemma 243 


There exists an $F$-automorphism such that $\sigma(\alpha_1) = \alpha_2$, $\sigma(\alpha_2) = \alpha_1$.


Proof. Complete the square: we have

$$g(y) = f\left(y - \tfrac{1}{2}b\right) = \left(y^2 - by + \tfrac{b^2}{4}\right) + b\left(y - \tfrac{b}{2}\right) + c = y^2 - \frac{b^2 - 4c}{4}.$$

This has roots $\gamma, -\gamma$, where $\gamma$ is $\tfrac{1}{2}\sqrt{D}$ ($D = b^2 - 4c$ is the “discriminant”). Then adjoining $\gamma$ is equivalent to adjoining $\alpha_1$, because the two differ by an element of $F$. Then the $F$-automorphism that sends $\gamma \mapsto -\gamma$ (and acts as the identity on $F$) sends $\alpha_1$ to $\alpha_2$ and vice versa, as desired.

Similarly, if $f(x)$ is an irreducible cubic with roots $\alpha_1, \alpha_2, \alpha_3$ in a splitting field $K$, there exists an $F$-automorphism such that $\sigma(\alpha_1) = \alpha_2$, $\sigma(\alpha_2) = \alpha_3$, $\sigma(\alpha_3) = \alpha_1$. It's not as clear how to construct this, though. It is true that a root has to go to another root under any such $\sigma$ because the relation $f(\alpha) = 0$ still needs to hold true, but we don't necessarily know that we will get this specific permutation. We might return to this later.


Recall Vieta's formulas: if we have a product of the form

$$(x - u_1)(x - u_2)\cdots(x - u_n),$$

we can expand it out as

$$x^n - s_1x^{n-1} + s_2x^{n-2} - \cdots \pm s_n,$$

where

$$s_1 = u_1 + \cdots + u_n = \sum_i u_i,$$

$$s_2 = u_1u_2 + \cdots = \sum_{i<j} u_iu_j,$$

and so on, where we basically pick $k$ of the $u_i$s for the coefficient of $x^{n-k}$. These are called the elementary symmetric functions, because they don't change when we permute the $u$s. In other words, we can think of $G = S_n$, the symmetric group, as operating on $F[u_1, \ldots, u_n]$ (though $F$ can really be any ring), with the rule

$$\sigma u_i = u_{\sigma(i)}$$

($S_n$ is operating on the indices).


Definition 244 


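A polynomial $g(u_1, \ldots, u_n)$ is symmetric if $\sigma(g) = g$ for all $\sigma \in S_n$.


In particular, this means

$$\sigma(g(u_1, \ldots, u_n)) = g(u_{\sigma(1)}, \ldots, u_{\sigma(n)}) = g(u_1, \ldots, u_n).$$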


Theorem 245 (Symmetric Functions Theorem) 
Every symmetric polynomial $g(u_1, \ldots, u_n)$ can be written in just one way as a polynomial in the elementary symmetric functions. Specifically, there exists a unique polynomial $G(z_1, \ldots, z_n)$ such that

$$g(u) = G(s_1, \ldots, s_n).$$


Example 246 


Consider $g(u) = u_1^2 + \cdots + u_n^2$ (the sum of $n$ squares).


For $n = 2$, we have $s_1 = u_1 + u_2$, $s_2 = u_1u_2$, and now $g(u) = s_1^2 - 2s_2$. For $n = 3$, we have $s_1 = u_1 + u_2 + u_3$, $s_2 = u_1u_2 + u_1u_3 + u_2u_3$, $s_3 = u_1u_2u_3$. The third symmetric polynomial here is useless because the degree is too large; let's try to do the rest systematically.

If we set $u_3 = 0$, we still have a symmetric polynomial in $u_1$ and $u_2$, and the expression for $G$ reduces to the elementary symmetric polynomials in two variables. Matching degrees tells us that we must have $g(u) = as_1^2 + bs_2$, and it's $s_1^2 - 2s_2$ in the $n = 2$ case, so this must be true for all $n \ge 2$.


Example 247 


What if we take $g = u_1^3 + \cdots + u_n^3$?


For $n = 1$, we have $g = s_1^3$. For $n = 2$, the monomials of degree 3 are $s_1^3$ and $s_1s_2$. Therefore,

$$u_1^3 + u_2^3 = as_1^3 + bs_1s_2$$

for some constants $a, b$. Set $u_2 = 0$, and $bs_1s_2$ goes away, so $a = 1$. To find $b$, we can set $u_1 = u_2 = 1$. Now the left side is 2, and the right side is $8 + 2b$, so $b = -3$.

Finally, let's do $n = 3$ and write $u_1^3 + u_2^3 + u_3^3$ in terms of symmetric functions. If we set $u_3 = 0$, we just remove the terms involving $s_3$ from the picture, and since $s_3$ has degree 3, that means we have

$$u_1^3 + u_2^3 + u_3^3 = s_1^3 - 3s_1s_2 + cs_3$$

for some $c$. Now plugging in $u_1 = u_2 = u_3 = 1$, this becomes $3 = 27 - 27 + c$, and $c = 3$.
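Identities like these are easy to spot-check on integer inputs. The sketch below (helper names are ours) computes the elementary symmetric functions directly from their definition and tests the sum-of-squares and sum-of-cubes formulas for three variables over a small grid of values.

```python
from itertools import combinations, product
from math import prod

def elementary(us, k):
    """k-th elementary symmetric function: sum of products of k distinct variables."""
    return sum(prod(c) for c in combinations(us, k))

def check(us):
    """Test both power-sum identities for a triple of integers."""
    s1, s2, s3 = (elementary(us, k) for k in (1, 2, 3))
    squares_ok = sum(u**2 for u in us) == s1**2 - 2 * s2
    cubes_ok = sum(u**3 for u in us) == s1**3 - 3 * s1 * s2 + 3 * s3
    return squares_ok and cubes_ok

all_ok = all(check(us) for us in product(range(-3, 4), repeat=3))
```

A finite check like this is not a proof, but two polynomials of degree 3 that agree on a $7 \times 7 \times 7$ grid must be equal, so here it happens to be conclusive.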
It should now be pretty clear how this generalizes: add on one variable at a time! We'll write out the proof formally 


now. 

Proof. Given that $g(u_1, \ldots, u_n)$ is symmetric, induct on $n$ and on the degree of $g$. Plugging in $u_n = 0$, define

$$g^0(u_1, \ldots, u_{n-1}) = g(u_1, \ldots, u_{n-1}, 0),$$

and also define the elementary symmetric functions

$$s_i^0(u_1, \ldots, u_{n-1}) = s_i(u_1, \ldots, u_{n-1}, 0).$$

Since $g$ is symmetric, $g^0$ is also symmetric in the first $n - 1$ variables, and the $s_i^0$s are the elementary symmetric polynomials in the $n - 1$ variables (plus the zero polynomial $s_n^0 = 0$). By the inductive hypothesis,

$$g^0(u_1, \ldots, u_{n-1}) = G(s_1^0, \ldots, s_{n-1}^0)$$

uniquely for some polynomial $G$. So now we can write

$$h(u_1, \ldots, u_n) = g(u_1, \ldots, u_n) - G(s_1, \ldots, s_{n-1})$$

(sure, $s_n$ doesn't appear, but we can still include it). Note that $h(u_1, \ldots, u_n) = 0$ if $u_n = 0$ by definition of $G$, so $u_n$ divides $h(u_1, \ldots, u_n)$. But $h$ is symmetric, because it is the difference of two symmetric polynomials. So if every monomial in $h$ contains $u_n$, all $u_i$ must divide $h$ as well, and thus the product $u_1u_2\cdots u_n$ divides $h$: we can write $h(u) = s_nq(u)$ for some other polynomial $q$ (which is also symmetric). And now $q$ can be written as some polynomial in $s_1, \ldots, s_n$, by induction on the degree.


There's one more very important symmetric polynomial:


Definition 248


The discriminant of a polynomial $P(x) = \prod_i (x - u_i)$ is

$$D = (u_1 - u_2)^2(u_1 - u_3)^2\cdots = \prod_{i<j}(u_i - u_j)^2.$$

$D$ is a symmetric polynomial, because any permutation takes each difference $u_i - u_j$ to some other one (up to a $\pm$ sign) and the squares fix potential issues with signs.


Remark 249. Note that this is consistent with the definition of the discriminant for monic quadratic polynomials:

$$D = (u_1 - u_2)^2 = u_1^2 + u_2^2 - 2u_1u_2 = (s_1^2 - 2s_2) - 2s_2 = s_1^2 - 4s_2,$$

where $s_1 = -b$, $s_2 = c$, so this is the familiar $b^2 - 4c$.


Example 250 


What can we say about the discriminant for a cubic (n = 3)? 


In this case, we have 
$$D = (u_1 - u_2)^2(u_1 - u_3)^2(u_2 - u_3)^2,$$

which is a symmetric polynomial of degree 6. This has to be some linear combination of $s_1^6, s_1^4s_2, s_1^3s_3, s_1^2s_2^2, s_1s_2s_3, s_2^3, s_3^2$.

It's not easy to determine the coefficients, really, but let's try the systematic method: if $u_3 = 0$, then

$$D^0 = (u_1 - u_2)^2u_1^2u_2^2 = \left((s_1^0)^2 - 4s_2^0\right)(s_2^0)^2,$$

so $D^0 = (s_1^0)^2(s_2^0)^2 - 4(s_2^0)^3$. The systematic method then tells us that

$$D = s_1^2s_2^2 - 4s_2^3 + s_3(\ast)$$

for some polynomial $\ast$: this just leaves some linear combination of $s_1^3s_3, s_1s_2s_3, s_3^2$. There isn't really a good way to do it other than putting in values for the $u$s and getting equations in the unknowns. This is not recommended: the coefficients turn out to be $-4, 18, -27$.
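Putting the pieces together gives $D = s_1^2s_2^2 - 4s_2^3 - 4s_1^3s_3 + 18s_1s_2s_3 - 27s_3^2$. Rather than solving for the coefficients by hand, we can at least confirm them numerically. The two functions below are our own sketch: one computes $D$ from its definition, the other from this formula.

```python
from itertools import product

def discriminant_direct(u1, u2, u3):
    """D as the product of squared differences of the roots."""
    return ((u1 - u2) * (u1 - u3) * (u2 - u3)) ** 2

def discriminant_symmetric(u1, u2, u3):
    """D written in terms of the elementary symmetric functions."""
    s1 = u1 + u2 + u3
    s2 = u1 * u2 + u1 * u3 + u2 * u3
    s3 = u1 * u2 * u3
    return (s1**2 * s2**2 - 4 * s2**3
            - 4 * s1**3 * s3 + 18 * s1 * s2 * s3 - 27 * s3**2)

formulas_agree = all(
    discriminant_direct(*u) == discriminant_symmetric(*u)
    for u in product(range(-4, 5), repeat=3)
)
```

For example, the roots $(1, 2, 3)$ give $D = 4$ both ways.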


30 April 29, 2019 


Let's review the ideas of symmetric functions: if we have variables $u_1, \dots, u_n$, then the symmetric group $S_n$ operates on the polynomial ring $F[u_1, \dots, u_n]$ by automorphisms: for any $\sigma \in S_n$, we send $u_i \mapsto u_{\sigma(i)}$ under the group action. We define a polynomial $p(u_1, \dots, u_n)$ to be symmetric if $p = \sigma(p)$ for all $\sigma \in S_n$. Then we have Theorem 245 from last class: every symmetric polynomial $p(u_1, \dots, u_n)$ can be uniquely written as a polynomial in the elementary symmetric polynomials
$$s_1 = u_1 + \cdots + u_n, \qquad s_2 = \sum_{i<j} u_i u_j,$$
up to
$$s_n = u_1 \cdots u_n.$$
Here's a useful fact that comes out of this:


Corollary 251 


If $f(x) \in F[x]$ is a polynomial that splits in an extension field $K$ with roots $\alpha_1, \dots, \alpha_n$, let $p(u_1, \dots, u_n)$ be a symmetric polynomial in $F[u_1, \dots, u_n]$. Then $p(\alpha_1, \dots, \alpha_n)$ is in $F$.

For example, the discriminant of a cubic is $(u_1 - u_2)^2 (u_1 - u_3)^2 (u_2 - u_3)^2$, which is symmetric in $u_1, u_2, u_3$, so $D(\alpha_1, \alpha_2, \alpha_3)$ is always in $F$.


Slightly more complicated proof than necessary. $p(u_1, \dots, u_n)$ is a polynomial in $s_1(u), \dots, s_n(u)$ by the symmetric function theorem. Thus, we can write $p(u_1, \dots, u_n) = \Phi(s_1, \dots, s_n)$ for some polynomial $\Phi$ with coefficients in $F$, and now we can substitute in $u_i = \alpha_i$.

For reference later on, let $H(x) = (x - u_1) \cdots (x - u_n) = x^n - s_1 x^{n-1} + \cdots \pm s_n$, so we get the polynomial $f(x) = (x - \alpha_1) \cdots (x - \alpha_n)$ by substituting in $u_i = \alpha_i$. So now $f(x) = x^n - s_1(\alpha) x^{n-1} + s_2(\alpha) x^{n-2} - \cdots \pm s_n(\alpha)$, and all $s_i(\alpha) \in F$ because $f$ is an element of $F[x]$. So now $p(\alpha_1, \dots, \alpha_n) = \Phi(s_1(\alpha), \dots, s_n(\alpha))$, and all $s_i(\alpha)$ are elements of $F$, so a polynomial in them is in $F$.

Basically, the elementary symmetric functions of the roots are in the field because $f(x) \in F[x]$, and then we can use the symmetric function theorem to write $p$ in terms of the elementary symmetric polynomials.

Here's another game we can play with symmetric functions: let $p(u_1, \dots, u_n)$ be a polynomial (not necessarily symmetric). We want to know about the orbit of $p = p_1$ under $S_n$. Its size divides $n!$, because we are permuting the $n$ variables among each other. Let's say this orbit is $\{p_1, \dots, p_k\}$: we can think of $S_n$ as operating on $\{p_1, \dots, p_k\}$.


Lemma 252 


Let $\Phi(w_1, \dots, w_k)$ be a symmetric polynomial in $w$. Then $\Phi(p_1(u), \dots, p_k(u))$ is a symmetric polynomial in $u_1, \dots, u_n$.

This is because $S_n$ operates on the polynomials, so when it permutes the $u_i$s, it also permutes the $p_j$s, and thus it fixes $\Phi(p_1(u), \dots, p_k(u))$.


Example 253 


Consider $n = 3$ with the polynomial $p_1 = u_1 u_2 + u_2 u_3$.

Then the orbit is $p_1$ plus the polynomials
$$p_2 = u_1 u_2 + u_1 u_3, \qquad p_3 = u_1 u_3 + u_2 u_3.$$
As an example of a symmetric polynomial of these three elements $(p_1, p_2, p_3)$, we have
$$s_3(p_1, p_2, p_3) = p_1 p_2 p_3 = 2 u_1^2 u_2^2 u_3^2 + \sum_{\text{sym}} u_1^3 u_2^2 u_3.$$
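The orbit in this example can be computed mechanically. The sketch below stores a polynomial as a set of (exponent tuple, coefficient) pairs and applies every permutation of the variables; the representation and the names `permute_variables` and `orbit` are choices made for this illustration only.

```python
from itertools import permutations

def permute_variables(poly, sigma):
    """Apply u_i -> u_{sigma(i)} to a polynomial.

    A polynomial is a frozenset of (exponents, coefficient) pairs, where
    exponents[i] is the power of u_{i+1}; sigma[i] is the 0-based image of i.
    """
    result = {}
    for exponents, coeff in poly:
        new_exponents = [0] * len(exponents)
        for i, e in enumerate(exponents):
            new_exponents[sigma[i]] = e
        key = tuple(new_exponents)
        result[key] = result.get(key, 0) + coeff
    return frozenset(result.items())

def orbit(poly, n):
    """Set of all images of poly under the symmetric group on n variables."""
    return {permute_variables(poly, sigma) for sigma in permutations(range(n))}

# p_1 = u_1 u_2 + u_2 u_3
p1 = frozenset({((1, 1, 0), 1), ((0, 1, 1), 1)})
orb = orbit(p1, 3)
print(len(orb))  # 3
```

The orbit has three elements, which divides $3! = 6$ as expected; the stabilizer of $p_1$ is generated by the transposition swapping $u_1$ and $u_3$.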


Now, let's put these two ideas together: recall that a splitting field $K$ of a polynomial $f(x) \in F[x]$ is a field where $f$ splits completely into linear factors corresponding to its roots $\alpha_1, \dots, \alpha_n$, and $K$ is generated by the $\alpha_i$s. (This can be thought of as $K$ being the smallest field containing the roots.)

Theorem 254 (Splitting theorem)
Let $K$ be the splitting field of $f(x) \in F[x]$, and let $g$ be an irreducible polynomial in $F[x]$. If $g$ has a root $\beta \in K$, then $g$ splits completely in $K$.

(Note that $g$ and $\beta$ are completely arbitrary here.)


Proof. Let the roots of $f$ be $\alpha_1, \dots, \alpha_n$. Every element of $K$ is a polynomial in the $\alpha_i$s (since $K$ is generated by them), so let $p(u_1, \dots, u_n)$ be an element of $F[u_1, \dots, u_n]$ such that
$$\beta = p(\alpha_1, \dots, \alpha_n).$$
$S_n$ operates on polynomials in $F[u_1, \dots, u_n]$, so let $\{p_1, \dots, p_k\}$ be the orbit of $p = p_1$. Let $\beta_j = p_j(\alpha_1, \dots, \alpha_n)$, where $\beta = \beta_1$ by definition.

Our goal is to show that the polynomial $h(x) = (x - \beta_1) \cdots (x - \beta_k)$ has coefficients in $F$. If this is true, then $\beta$ is a root of $h(x)$ and also of $g(x)$, so they aren't relatively prime. But since $g$ is irreducible, this would imply that $g$ divides $h$: then because $h$ splits in $K$, $g$ must also split in $K$. Let's do an example for illustration:


Example 255 


Let $\zeta = e^{2\pi i/9}$ be a ninth root of unity. Its irreducible polynomial over $F = \mathbb{Q}$ is $x^6 + x^3 + 1$, and the roots in the splitting field $K$ are $\zeta, \zeta^2, \zeta^4, \zeta^5, \zeta^7, \zeta^8$. Let's call them $u_1, u_2, \dots, u_6$.

Then what we're saying is that the minimal polynomial of $\beta = \zeta + \zeta^8 = 2\cos\frac{2\pi}{9}$ must split: the orbit of $u_1 + u_6$ is all polynomials of the form $u_i + u_j$, where $i \neq j$. (There are $\binom{6}{2} = 15$ of them.) Then
$$h(x) = \prod_{i<j} \left(x - [u_i + u_j]\right)$$
has degree 15, and $\beta$ is a root of $g(x) = x^3 - 3x + 1$ (this can be checked). And $g(x)$ will factor as
$$g(x) = \left(x - (\zeta + \zeta^8)\right)\left(x - (\zeta^2 + \zeta^7)\right)\left(x - (\zeta^4 + \zeta^5)\right),$$
and this is indeed a factor of $h(x)$.
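The "this can be checked" step can be done numerically. The following sketch uses floating-point complex arithmetic, so it only confirms the claims up to rounding error (the tolerance $10^{-9}$ is an arbitrary choice).

```python
import cmath
from itertools import combinations

zeta = cmath.exp(2j * cmath.pi / 9)
roots = [zeta**k for k in (1, 2, 4, 5, 7, 8)]  # the six roots of x^6 + x^3 + 1

def g(x):
    return x**3 - 3 * x + 1

# The fifteen sums u_i + u_j with i < j, i.e. the roots of h(x).
pair_sums = [a + b for a, b in combinations(roots, 2)]

# The three claimed roots of g.
g_roots = [zeta + zeta**8, zeta**2 + zeta**7, zeta**4 + zeta**5]
residuals = [abs(g(b)) for b in g_roots]
print(len(pair_sums), max(residuals) < 1e-9)  # 15 True
```

Each of the three roots of $g$ also appears among the fifteen pair sums, which is the numerical shadow of $g$ dividing $h$.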


So returning to our proof, how do we show that $h(x)$ has coefficients in $F$? For variables $w_1, \dots, w_k$, let
$$\mathcal{H}(x) = (x - w_1) \cdots (x - w_k),$$
and we substitute $w_j = p_j(u)$ in to get
$$H(x) = (x - p_1(u))(x - p_2(u)) \cdots (x - p_k(u)).$$
Now substituting $u_i = \alpha_i$, we get back to the familiar
$$h(x) = (x - \beta_1) \cdots (x - \beta_k).$$
But $\mathcal{H}$ was symmetric in the $w_j$s, and its coefficients are (up to sign) the elementary symmetric polynomials $s_i(w_1, \dots, w_k)$. So the coefficients of $H$ are (up to sign) $s_i(p_1(u), \dots, p_k(u))$, which are symmetric in $u_1, \dots, u_n$ by Lemma 252.

Thus the coefficients of $h(x)$ are in $F$, because plugging in the roots $\alpha_i$ (to get from $H$ to $h$) gives us symmetric polynomials evaluated at the $\alpha_i$s, which are in $F$ by Corollary 251.

31 May 1, 2019


Today, we're going to start Galois theory — it'll take a bit of getting used to. 


Definition 256 


Let $K$ and $K'$ be extension fields of $F$. Then an $F$-isomorphism $\varphi$ from $K$ to $K'$ is an isomorphism which is the identity on $F$. Similarly, an $F$-automorphism of $K$ is just an $F$-isomorphism from $K$ to itself.

The important concept here is the set of all such automorphisms:


Definition 257 
The Galois group G(K/F) is the set of all F-automorphisms of K. 


It turns out this group is the right way for us to understand splitting fields: 


Proposition 258 


If $f$ is an irreducible polynomial in $F[x]$, and it has a root $\alpha$ in $K$, then any $F$-isomorphism $\varphi : K \to K'$ takes $\alpha$ to another root of $f$.


(This is an extremely important result to keep in mind!) 


Proof. By convention, let's write $f(x) = x^n - a_1 x^{n-1} + \cdots \pm a_n$. This will make things simpler, because now if $f$ splits and has roots $\alpha_1, \dots, \alpha_n$, then $a_i$ is just the symmetric polynomial $s_i(\alpha_1, \dots, \alpha_n)$ by Vieta's formulas.

So now because $f(\alpha) = 0$, we also have $\varphi(f(\alpha)) = 0$, which means that
$$\varphi(\alpha)^n - \varphi(a_1)\varphi(\alpha)^{n-1} + \cdots \pm \varphi(a_n) = 0.$$
But the coefficients are fixed: $\varphi(a_i) = a_i$, since $F$ is fixed, and therefore $\varphi(\alpha)$ is also a root of $f$, as desired.


In particular, this also applies to F-automorphisms. 


Proposition 259 


If $f$ is irreducible, $\alpha$ is a root of $f$ in $K$, and $\alpha'$ is a root of $f$ in $K'$, then there exists a unique $F$-isomorphism $\varphi : F(\alpha) \to F(\alpha')$ such that $\alpha$ is sent to $\alpha'$.

This is because $F(\alpha)$ and $F(\alpha')$ are both isomorphic to $F[x]/(f)$, so they are isomorphic to each other. We can't necessarily say that $K$ and $K'$ are isomorphic, though: they might be much larger.


Proposition 260 
Say that $f(x)$ has roots $\alpha_1, \dots, \alpha_n \in K$, and we can write $K = F(\alpha_1, \dots, \alpha_n)$. Then the Galois group $G(K/F)$ operates faithfully on $\{\alpha_1, \dots, \alpha_n\}$: that is, if $\sigma$ fixes all $\alpha_i$s, then $\sigma$ is the identity.

Proof. Every element of $K$ can be written as a polynomial in the $\alpha_i$s with coefficients in $F$, and now apply $\sigma$ to both sides: any $\sigma$ that fixes all $\alpha_i$s (as well as the field $F$) keeps all elements constant.

In fact, in general, knowing where the $\alpha_i$s go tells us where everything else goes. So that means that $G$ is isomorphic to a subgroup of $S_n$ by its action on roots!

Let's try to compute the Galois group for some small-degree polynomials. We'll start by looking at quadratic irreducible polynomials $f(x) \in F[x]$. $K$, the splitting field of $f$ over $F$, adjoins the two roots $\alpha_1$ and $\alpha_2$ to $F$.


Fact 261 


Again, remember that we are working in characteristic zero, so we have no repeated roots. 


In particular, our polynomial can be written as $f(x) = x^2 - a_1 x + a_2$, where $a_1 = \alpha_1 + \alpha_2$, $a_2 = \alpha_1 \alpha_2$. By the proposition above, there exists a unique $F$-isomorphism
$$\sigma : F(\alpha_1) \to F(\alpha_2)$$
sending $\alpha_1$ to $\alpha_2$. Also, notice that we get the second root for free: $\alpha_2 = a_1 - \alpha_1$ can be recovered (or vice versa), so both $F(\alpha_1)$ and $F(\alpha_2)$ are just $K$, and $\sigma$ is an automorphism of $K$. This, together with the identity automorphism, means that the Galois group is cyclic of order 2, and it's generated by a transposition.

Things get a lot harder as the degree gets larger, but the cubic is pretty easy. Let's say $f(x)$ is an irreducible cubic in $F[x]$: then we can write
$$f(x) = x^3 - a_1 x^2 + a_2 x - a_3.$$
We have roots $\alpha_1, \alpha_2, \alpha_3$, and this means we have some tower
$$F \subset F(\alpha_1) = F_1 \subset F(\alpha_1, \alpha_2) = K,$$
since we get the third root $\alpha_3 = a_1 - \alpha_1 - \alpha_2$ for free. We know the first extension has degree 3: all that's left is to ask about the second extension.

Well, we know that in $F_1[x]$, $f(x)$ factors as
$$f(x) = (x - \alpha_1) q(x)$$
for some quadratic $q(x)$. The coefficients of $q$ then involve $\alpha_1$, and the quadratic has roots $\alpha_2, \alpha_3$ in $K$. There are two cases now: if $q(x)$ is irreducible in $F_1[x]$, then the extension $K/F_1$ has degree 2, and if $q$ factors, then the degree is 1 (this means the other two roots are already in $F_1$). So $[K : F]$ is either 3 or 6.


- What can we say when $[K : F] = 3$? Then we have $F(\alpha_1) = K = F(\alpha_2)$ (both have degree 3), and then there exists a unique $F$-isomorphism $\sigma$ from $F(\alpha_1)$ to $F(\alpha_2)$ sending $\alpha_1$ to $\alpha_2$. $\sigma$ is an element of the Galois group: we ask what it does to $\alpha_2$. It can't go to itself (because $\alpha_1$ is already going to $\alpha_2$), and if $\alpha_2$ goes to $\alpha_1$, then we'd have to fix $\alpha_3$. But this can't be true: it would imply that $\sigma$ is the identity, because there is a unique map that sends $\alpha_3$ to itself in $K = F(\alpha_3)$! So $\alpha_2$ goes to $\alpha_3$ and $\alpha_3$ goes to $\alpha_1$.

  This means that $\sigma$ must be the cyclic permutation $(123)$ of the roots. Then $(123)$ is the only thing that sends $\alpha_1$ to $\alpha_2$, $(132)$ is the only thing that sends $\alpha_1$ to $\alpha_3$, and the identity sends $\alpha_1$ to itself. So in this case, $G$ is the cyclic group generated by $\sigma$, which is the alternating group $A_3$.

- In the other case, if $[K : F] = 6$, then $F(\alpha_1) \neq K$, since $[K : F(\alpha_1)] = 2$. In addition, the quadratic polynomial $q(x)$ as defined above is irreducible in $F_1[x]$. We've analyzed quadratics before: the Galois group of $K$ over $F_1$ is the cyclic group of order 2. Let's say it's generated by $\tau$. Then $\tau$ fixes $\alpha_1$, since it's an element of $F_1$, and it also fixes $F$, since $F \subset F_1$. So $\tau$ is in the Galois group of $K/F$ as well: in particular, it's the transposition $(23)$.

  We can repeat this argument if we adjoin $\alpha_2$ first, or $\alpha_3$, so $G(K/F)$ also contains the transpositions $(13)$ and $(12)$. Specifically, this means that $G(K/F) = S_3$.


Corollary 262 


In both cases, we have 


[K : F] =|G(K/F)|. 


We're not quite done yet: how do we know which case we're in? Is there an easier way for us to tell whether $q(x)$ is irreducible in $F_1[x]$ without having to work in a new field? It turns out that the discriminant is the secret here: we know that
$$D(u_1, u_2, u_3) = (u_1 - u_2)^2 (u_1 - u_3)^2 (u_2 - u_3)^2$$
can be written as some combination of the symmetric polynomials $s_1, s_2, s_3$. In particular,
$$D(\alpha_1, \alpha_2, \alpha_3) = -4 a_1^3 a_3 + a_1^2 a_2^2 + 18 a_1 a_2 a_3 - 4 a_2^3 - 27 a_3^2 \in F$$
because all of our coefficients $a_1, a_2, a_3 \in F$. We want to look at the square root
$$\delta = (u_1 - u_2)(u_1 - u_3)(u_2 - u_3).$$
Specifically, $\delta(\alpha_1, \alpha_2, \alpha_3)^2 = D(\alpha_1, \alpha_2, \alpha_3)$. If $D(\alpha)$ is not a square in $F$, then $\delta \notin F$: adjoining $\delta = (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_2 - \alpha_3)$ yields
$$[F(\delta) : F] = 2.$$
But $\delta$ is contained in the splitting field, so
$$F \subset F(\delta) \subset K \implies [K : F] \text{ is even} \implies [K : F] = 6.$$

On the other hand, if $D(\alpha)$ is a square, then $\delta \in F$: how does the symmetric group $S_3$ operate on $\delta(u)$? We can try it out: even permutations fix $\delta$, and odd ones flip the sign of $\delta$. Now $\delta \neq 0$, because the roots are distinct for irreducible $f$. This means that the Galois group does not contain an odd permutation: $\delta \in F$, and $F$ is supposed to be fixed. Therefore $[K : F] = 3$ as desired.


Fact 263 
So the square root of the discriminant tells us the Galois group for a cubic! Most of the time, it'll be $S_3$, because it's pretty unlikely for an ugly expression in $a_1, a_2, a_3$ to be a square. One case where we are happy is when there is no quadratic term: $f(x) = x^3 + a_2 x - a_3 \implies D(\alpha) = -4a_2^3 - 27a_3^2$.

So recall the following example from class earlier: if $\zeta = e^{2\pi i/9}$ is a 9th root of unity, let $\alpha_1 = \zeta + \zeta^8$, $\alpha_2 = \zeta^2 + \zeta^7$, $\alpha_3 = \zeta^4 + \zeta^5$. These are roots of $x^3 - 3x + 1$, and the discriminant is $-4(-3)^3 - 27(1)^2 = 81$, which is a square. That means adjoining $\alpha_1$ gives the other two roots for free as well.
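The discriminant test is easy to automate for cubics with no quadratic term. The sketch below assumes integer coefficients and ground field $\mathbb{Q}$, writes the cubic as $x^3 + px + q$ (so $p = a_2$ and $q = -a_3$ in the notation above), and checks irreducibility with the rational root test; the function names are invented for this illustration.

```python
from math import isqrt

def is_square(d):
    """True if the integer d is a perfect square (equivalently, a square in Q)."""
    return d >= 0 and isqrt(d) ** 2 == d

def has_integer_root(p, q):
    """Rational root test for the monic integer cubic x^3 + p x + q."""
    if q == 0:
        return True  # x = 0 is a root
    divisors = [d for d in range(1, abs(q) + 1) if q % d == 0]
    return any(r**3 + p * r + q == 0 for d in divisors for r in (d, -d))

def cubic_galois_group(p, q):
    """Galois group over Q of the irreducible cubic x^3 + p x + q."""
    if has_integer_root(p, q):
        raise ValueError("cubic is reducible over Q")
    disc = -4 * p**3 - 27 * q**2
    return "A3" if is_square(disc) else "S3"

print(cubic_galois_group(-3, 1))  # x^3 - 3x + 1, discriminant 81: A3
print(cubic_galois_group(0, -2))  # x^3 - 2, discriminant -108: S3
```

A monic integer cubic is reducible over $\mathbb{Q}$ exactly when it has an integer root dividing the constant term, which is why the rational root test suffices here.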


32 May 3, 2019 


Today we're going to discuss the main theorem of Galois theory.

Theorem 264
Suppose $G$ is a finite group of automorphisms of a field $K$. Let $F = K^G$ be the fixed field of $G$: that is, it's the set of elements
$$F = \{\alpha \in K \mid \sigma\alpha = \alpha \ \ \forall \sigma \in G\}.$$
Then $|G| = [K : F]$.


Proof. Pick an arbitrary $\alpha_1 \in K$: it will have some orbit $\{\alpha_1, \dots, \alpha_r\}$, where $r$ divides $n = |G|$ by the orbit-stabilizer theorem. Consider the polynomial
$$f(x) = (x - \alpha_1) \cdots (x - \alpha_r) = x^r - s_1(\alpha) x^{r-1} + \cdots \pm s_r(\alpha),$$
where the $s_i$s are the elementary symmetric polynomials in the $\alpha_i$s. Since each $\sigma \in G$ permutes the orbit, it fixes each of the symmetric polynomials: this means that $f(x)$ is fixed by every element of $G$, so its coefficients are in the fixed field of $G$. In other words,
$$s_i(\alpha) \in F \implies f(x) \in F[x].$$

We should check that $f(x)$ is irreducible: if we write $f(x) = g(x)h(x)$ in $F[x]$, where $\alpha_1$ is a root of $g(x)$ without loss of generality, there exists some $\sigma_i \in G$ with $\sigma_i(\alpha_1) = \alpha_i$ (this exists because the $\alpha_i$s form an orbit). So now
$$g(\alpha_1) = 0 \implies g(\alpha_i) = 0$$
after applying $\sigma_i$ (which fixes the coefficients of $g$), so $\alpha_i$ is a root of $g(x)$. But this means all $\alpha_i$s are roots of $g$, so $f$ divides $g$, meaning we can't factor $f$ in the first place.

So the degree of $\alpha_1$ over $F$, which is the degree of $f$, is equal to $r$, the number of roots of $f$. Choose $\alpha_1$ so that $r$ is maximal: our goal is to show that $K = F(\alpha_1)$. In other words, if $\beta \in K$ is an arbitrary element, we want to show it is in $F(\alpha_1)$. Note that $F(\alpha_1, \beta) \supset F(\alpha_1) \supset F$, where the second extension has degree $r$. We know that $[F(\alpha_1, \beta) : F(\alpha_1)] \le [F(\beta) : F]$, since polynomials are only "more irreducible" in $F$ than in $F(\alpha_1)$. So the first extension has degree at most $r$, meaning that $F(\alpha_1, \beta)$ is a finite extension of $F$.

So now by the primitive element theorem, there exists $\gamma$ such that $F(\alpha_1, \beta) = F(\gamma)$. Now $[F(\gamma) : F] \le \deg \alpha_1 = r$, because we chose $\alpha_1$ maximally! But $[F(\alpha_1, \beta) : F] \ge r$ from the chain of inclusions, so we have equality on both sides, and therefore
$$[F(\alpha_1, \beta) : F] = r \implies [F(\alpha_1, \beta) : F(\alpha_1)] = 1 \implies \beta \in F(\alpha_1),$$
as desired.

Now to finish, we want to show that this maximal $r$ is equal to $n$, which will conclude the proof. By the orbit-stabilizer theorem,
$$|G| = |\mathrm{Stab}(\alpha_1)| \, |\mathrm{Orbit}(\alpha_1)|,$$
and now if $\sigma \in \mathrm{Stab}(\alpha_1)$, we know that $\sigma(\alpha_1) = \alpha_1$. But that means $\sigma$ is the identity on all of $F(\alpha_1) = K$! Thus the stabilizer only has one element, and we're done: we do have $|G| = r = [K : F]$.


Remember that an $F$-automorphism of a field extension $K$ is an automorphism $\sigma : K \to K$ that is the identity on $F$. We define the Galois group of an extension $K/F$ to be the set of $F$-automorphisms of $K$.

Theorem 265
Let $K$ be a finite extension of $F$, and let $G$ be the Galois group $G(K/F)$. The following are equivalent:

- $K$ is a splitting field over $F$.
- $|G| = [K : F]$; in other words, $K$ is a Galois extension of $F$.
- $F$ is the fixed field of $G$.

Note that this gives us a way to find whether an element of $K$ is actually in $F$: it depends on whether everything in the Galois group fixes it!


Proof. Pick a primitive element $\alpha_1$ for $K$, so that $K = F(\alpha_1)$, and let $f(x)$ be the irreducible polynomial for $\alpha_1$ in $F[x]$. Let $\alpha_1, \dots, \alpha_r$ be the roots of $f$ in $K$: we know that $r \le n = \deg f = [K : F]$.

We know there exists a unique $F$-isomorphism $\sigma_i$ from $K = F(\alpha_1)$ to $F(\alpha_i)$ sending $\alpha_1$ to $\alpha_i$, and because $F(\alpha_i)$ also has degree $n$ over $F$, we actually have $F(\alpha_i) = K$. So $\sigma_i$ is an $F$-automorphism, and therefore $\sigma_i \in G$.

So $G$ is just the set $\{\sigma_1, \dots, \sigma_r\}$, since each element of the Galois group must send $\alpha_1$ to another root of $f$ (and is determined by where $\alpha_1$ goes), and therefore $|G| = r \le n = [K : F]$. If $|G| = [K : F]$, then $r = n$, which means $f$ has $n$ roots and degree $n$, meaning $f$ splits completely (which implies that $K$ is a splitting field). Conversely, if $K$ is a splitting field, then $f$ splits completely in $K$ by the splitting theorem (Theorem 254), so $r = n$. This proves the equivalence of the first two statements.

To show that the third is equivalent, $F$ is contained in the fixed field $K^G$ (by definition of an $F$-automorphism), so we have
$$F \subset K^G \subset K,$$
where $[K : F] = n$. But $[K : K^G] = |G|$ by Theorem 264, meaning that $|G| = [K : F]$ if and only if $[K^G : F] = 1$, and this gives us the result we want.


Theorem 266 (Main Theorem) 


Let $K$ be a Galois extension of $F$, and let $G = G(K/F)$. Then subgroups of $G$ correspond bijectively to intermediate fields $F \subset L \subset K$.

This is possibly a bit surprising: $G$ is a finite group, so this means there are only finitely many intermediate fields between $F$ and $K$. In other words, picking any element of $K$ and considering the field it generates gives only finitely many choices!


Example 267 


Take $F = \mathbb{Q}$, and $K = F(\zeta)$, where $\zeta = e^{2\pi i/7}$.

Now $\zeta$ is a root of the irreducible polynomial $f(x) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$, and the roots are $\zeta, \zeta^2, \dots, \zeta^6$. Since $K$ is the splitting field of $f(x)$, $K$ is a Galois extension of $F$ with $G = G(K/F)$, and the order of the group is $|G| = [K : F] = 6$.

Note that we have an $F$-isomorphism $\sigma_i$ from $F(\zeta)$ to $F(\zeta^i)$ sending $\zeta \mapsto \zeta^i$: by the same reasoning as before, each $\zeta^i$ also has degree 6 over $F$, so this is the unique $F$-automorphism $K \to K$ sending $\zeta \mapsto \zeta^i$. In particular, $G$ is the cyclic group generated by $\sigma_3$, which is the map sending $\zeta$ to $\zeta^3$.

Now we have the $F$-automorphism $\rho = \sigma_2 = \sigma_3^2$, which sends $\zeta$ to $\zeta^2$. The subgroup generated by $\rho$ has order 3: what's the corresponding intermediate field? Consider the polynomial whose roots are the orbit of $\zeta$ under $\rho$: this is
$$(x - \zeta)(x - \zeta^2)(x - \zeta^4) = x^3 - (\zeta + \zeta^2 + \zeta^4) x^2 + (\zeta^3 + \zeta^5 + \zeta^6) x - 1,$$
and let $\beta_1 = \zeta + \zeta^2 + \zeta^4$ (the negative of the $x^2$-coefficient) and $\beta_2 = \zeta^3 + \zeta^5 + \zeta^6$ (the $x$-coefficient). We know that $\beta_1, \beta_2$ are in the fixed field $K^{\langle \rho \rangle}$, and now
$$(x - \beta_1)(x - \beta_2) = x^2 + x + 2.$$
So we have $\beta_1 = \frac{-1 + \sqrt{-7}}{2}$ and $\beta_2 = \frac{-1 - \sqrt{-7}}{2}$, and the intermediate field is just $\mathbb{Q}(\beta_1, \beta_2)$ (we really only need to adjoin one of them).
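Here is a quick floating-point check of the quadratic $x^2 + x + 2$, taking $\beta_1 = \zeta + \zeta^2 + \zeta^4$ and $\beta_2 = \zeta^3 + \zeta^5 + \zeta^6$ (the sign convention is an assumption of this sketch). It only confirms the identities up to rounding error.

```python
import cmath

zeta = cmath.exp(2j * cmath.pi / 7)

# Orbit of zeta under rho: zeta -> zeta^2 -> zeta^4 -> zeta^8 = zeta.
beta1 = zeta + zeta**2 + zeta**4
beta2 = zeta**3 + zeta**5 + zeta**6

sum_ok = abs(beta1 + beta2 + 1) < 1e-9       # beta1 + beta2 = -1
product_ok = abs(beta1 * beta2 - 2) < 1e-9   # beta1 * beta2 = 2
root_ok = abs(beta1 - (-1 + cmath.sqrt(-7)) / 2) < 1e-9
print(sum_ok, product_ok, root_ok)  # True True True
```

So $\beta_1$ and $\beta_2$ are the two roots of $x^2 + x + 2$, and the intermediate field is $\mathbb{Q}(\sqrt{-7})$.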


33 May 6, 2019

Recall that last week, we started talking about some ideas of the main theorem. Here are some of the ideas that we proved last time: if we have a finite group $G$ of automorphisms of a field $K$, and we let $K^G$ be the fixed field of $G$, then $[K : K^G] = |G|$. We defined a Galois extension $K$ of $F$ to be a field extension that satisfies any of the following equivalent conditions:

- $K$ is a splitting field over $F$.
- $|G(K/F)| = [K : F]$.
- $F = K^G$, the fixed field of $K$ under $G$.

This leads us to Theorem 266: if $K/F$ is a Galois extension, then we have a bijective correspondence between intermediate fields $F \subset L \subset K$ and subgroups of $G = G(K/F)$. Specifically, any intermediate field $L$ corresponds to the Galois group $G(K/L) = H$, so $F$ corresponds to the whole group $G$, and $K$ corresponds to the identity. In particular, making $L$ larger makes $H$ smaller.


Example 268 


Let's illustrate this with another example: take $F = \mathbb{Q}$ and $K = F(\alpha, \beta)$, where $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$.

We have a chain $F \subset F(\alpha) \subset F(\alpha, \beta) = K$. The first extension has degree 2 because $x^2 - 2$ is irreducible, and $\beta$ is a root of $x^2 - 3$, which remains irreducible in $F(\alpha)$. So $[K : F] = [K : F(\alpha)][F(\alpha) : F] = 2 \cdot 2 = 4$, and now because $K$ is the splitting field of $(x^2 - 2)(x^2 - 3)$, $|G(K/F)| = 4$.

To find the Galois group, first note that any automorphism that fixes both $\sqrt{2}$ and $\sqrt{3}$ fixes everything. If $\sigma \in G$, then $\sigma$ must send $\alpha$ to one of the roots of $x^2 - 2$, that is, $\pm\alpha$. Similarly, $\sigma\beta$ must be one of $\pm\beta$. There are only four choices in total, so we have the Klein four group generated by $\sigma$ (sending $\alpha \mapsto -\alpha$, $\beta \mapsto \beta$) and $\tau$ (sending $\alpha \mapsto \alpha$, $\beta \mapsto -\beta$).

From this, we can find the intermediate fields between $F$ and $K$. The only nontrivial proper subgroups of the whole group $G$ are $\langle\sigma\rangle$, $\langle\tau\rangle$, $\langle\sigma\tau\rangle$. By the main theorem, any subgroup $H$ corresponds to $L = K^H$: this means we get $F(\beta)$, $F(\alpha)$, and $F(\alpha\beta)$, respectively.
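The fixed fields can be read off by letting each automorphism act on the $\mathbb{Q}$-basis $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$ of $K$. In this sketch an element $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ is stored as the tuple `(a, b, c, d)`; the tuple encoding and the helper names are choices made for this illustration.

```python
# An element a + b*sqrt2 + c*sqrt3 + d*sqrt6 of K is stored as (a, b, c, d).
BASIS = ("1", "sqrt2", "sqrt3", "sqrt6")

def sigma(v):
    """sqrt2 -> -sqrt2, sqrt3 -> sqrt3 (so sqrt6 -> -sqrt6)."""
    a, b, c, d = v
    return (a, -b, c, -d)

def tau(v):
    """sqrt2 -> sqrt2, sqrt3 -> -sqrt3 (so sqrt6 -> -sqrt6)."""
    a, b, c, d = v
    return (a, b, -c, -d)

def sigma_tau(v):
    return sigma(tau(v))

def fixed_basis(automorphism):
    """Names of the basis elements fixed by the given automorphism."""
    fixed = []
    for i, name in enumerate(BASIS):
        e = tuple(1 if j == i else 0 for j in range(4))
        if automorphism(e) == e:
            fixed.append(name)
    return fixed

for name, f in (("sigma", sigma), ("tau", tau), ("sigma*tau", sigma_tau)):
    print(name, fixed_basis(f))
```

Each automorphism acts on this basis by signs, so its fixed field is spanned by the basis elements it fixes: $1, \sqrt{3}$ for $\sigma$; $1, \sqrt{2}$ for $\tau$; and $1, \sqrt{6}$ for $\sigma\tau$, matching $F(\beta)$, $F(\alpha)$, $F(\alpha\beta)$.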


Example 269 
Let's let $K$ be the splitting field of $x^3 - 2$ over $F = \mathbb{Q}$: then the roots are $\alpha = \sqrt[3]{2}$, $\alpha\omega$, $\alpha\omega^2$, where $\omega = e^{2\pi i/3}$ is a cube root of unity.

This time we have the chain
$$F \subset F(\alpha) \subset F(\alpha, \omega\alpha) = F(\alpha, \omega) = K,$$
where the extensions are of degree 3 and 2 respectively. So now $[K : F] = 6$, and $G$ operates faithfully on the roots: this means $G$ must be isomorphic to $S_3$, and we can describe each element of $G$ by the permutation on the roots.

Our diagram of subgroups is still not that interesting: the nontrivial proper subgroups are $A_3$, $\langle(23)\rangle$, $\langle(13)\rangle$, $\langle(12)\rangle$. The index of the subgroup $[S_3 : A_3] = 2$, and this corresponds to an extension of degree 2. (Specifically, we have $F(\delta)$, where $\delta = (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_2 - \alpha_3)$ is the square root of the discriminant.) Meanwhile, $[S_3 : \langle(23)\rangle] = 3$, which corresponds to $F(\alpha)$.


Example 270 


What if $f(x)$ is an irreducible quartic with roots $\alpha_1, \alpha_2, \alpha_3, \alpha_4$?

We have our chain
$$F \subset F(\alpha_1) \subseteq F(\alpha_1, \alpha_2) \subseteq F(\alpha_1, \alpha_2, \alpha_3) \subseteq K,$$
where the extensions have degree $4$, $\le 3$, $\le 2$, $1$ (because $\alpha_2$ is a root of the cubic $f(x)/(x - \alpha_1)$ with coefficients in $F(\alpha_1)$, but it might not be irreducible). So we have that $|G| = [K : F] \le 4!$, and let's look at the case where $G = S_4$ (so we can permute the roots however we'd like).

Let's think about the subgroups of $S_4$. We have the alternating group $A_4$ of order 12, which has index 2. By the Sylow theorems, there exists a Sylow 2-subgroup of order 8 (and therefore index 3), and it turns out that there are three of them, each a dihedral group $D_4$ (and they intersect). $S_4$ also contains four copies of $S_3$ (each fixing one index) of order 6 (and therefore index 4). We also have smaller groups: those generated by a 4-cycle, and so on.

This gives us a much more complicated diagram. We'll discuss this a bit more on Wednesday, but here are a few examples: $[S_4 : S_3] = 4$ corresponds to adjoining a root $F(\alpha_1)$, and $[S_3 : S_2] = 3$ corresponds to adjoining another root $F(\alpha_1, \alpha_2)$. $[S_4 : A_4] = 2$ corresponds, again, to adjoining the square root of the discriminant
$$\delta = (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_1 - \alpha_4)(\alpha_2 - \alpha_3)(\alpha_2 - \alpha_4)(\alpha_3 - \alpha_4).$$


We're now ready to prove Theorem 266, the Main Theorem: 


Proof. We'll show that given an intermediate field $L$, we can map it to $G(K/L) = H$ and show that $L = K^H$. Also, given a subgroup $H$, we'll show we can map it to $L = K^H$ and show that $G(K/L) = H$. This would show that the two maps are inverse to each other, so the correspondence is indeed bijective, as desired.

If we have a field $L$, note that $K$ being a splitting field over $F$ implies that $K$ is a splitting field over $L$ (just take the same polynomial, which will split in the same way). So $K$ is a Galois extension of $L$ as well, meaning that $|G(K/L)| = [K : L]$. Now if $H$ is any subgroup of $G$, we know that $[K : K^H] = |H|$ by Theorem 264, and now we can let $H = G(K/L)$. This is some subgroup of $G$, and in particular, $L \subset K^H \subset K$ (the first inclusion because every element of $H$ fixes $L$ by definition). And now $[K : K^H] = |H|$, and $[K : L] = |G(K/L)| = |H|$, so that just means we have equality: $L = K^H$ as desired.

On the other hand, if we're given a subgroup $H$, let $L = K^H$. We know that $[K : L] = |H|$, and since $K$ is a Galois extension of $L$, $[K : L] = |G(K/L)|$. The two groups $H$ and $G(K/L)$ have the same order, and we know that
$$H \subset G(K/L) \subset G,$$
where the first inclusion comes from every element of $H$ fixing $L$. But now both $|H|$ and $|G(K/L)|$ are equal to $[K : L]$, so $H = G(K/L)$.


To see why this is called the Main Theorem, we need to spend a bit more time working with it to understand its significance. Let's go back to cubic equations:

Theorem 271
Suppose that $K/F$ is a Galois extension with $[K : F] = 3$, and say that $K \subset \mathbb{C}$ (to save time). Also, suppose that the cube roots of unity (in particular $\omega = e^{2\pi i/3}$) are in our ground field $F$. Then $K$ is obtained by adjoining a cube root: $K = F(\gamma)$, where $\gamma^3 \in F$.


Proof. The Galois group has order 3 in this case, and it is cyclic. We can use the fact that $K$ is an $F$-vector space of dimension 3. Letting $\sigma$ be a generator of our group $G(K/F)$, $\sigma$ is an $F$-linear operator, because it is an $F$-automorphism of $K$. Specifically, if $a \in F$,
$$\sigma(a\beta) = \sigma(a)\sigma(\beta) = a\sigma(\beta),$$
so multiplying by scalars is consistent with our operator. In addition, $\sigma^3 = 1$ is the identity operator, so the eigenvalues of $\sigma$ are cube roots of 1. They can't all be 1: this is not obvious, but it's related to a problem on the problem set about diagonalizing! So there exists some eigenvalue $\lambda = \omega$ or $\bar\omega$, and both occur (but we don't need to check that).

So now take a corresponding eigenvector to be our element $\gamma \in K$: we have $\sigma(\gamma) = \lambda\gamma$, and $\sigma(\gamma^3) = \lambda^3\gamma^3 = \gamma^3$. Since $\sigma$ generates the Galois group, $\gamma^3 \in K^G = F$. Furthermore, $\gamma \notin F$ (because $\sigma$ does not fix it) and $[K : F] = 3$ is prime, so we must have $K = F(\gamma)$, as desired.


By the way, this theorem works for any prime p (not just 3). We might come back to some of these ideas next 


time! 


34 May 8, 2019 


Let's say $K$ is the splitting field over $F$ of an irreducible polynomial $f \in F[x]$, and the roots of $f$ are $\alpha_1, \dots, \alpha_n$. There are basically two questions we want to ask:

- What is $G = G(K/F)$, the Galois group of $K$ over $F$?
- What are the intermediate fields $L$ between $F$ and $K$?

We know that the subgroups of $G$ correspond to the intermediate fields $L$ by the main theorem, but the correspondence may not be easy to deduce explicitly.


Example 272 


Let’s go back to the case where the degree of f is 3. 


Then we either have $G = A_3$ if $[K : F] = |G| = 3$, or we have $G = S_3$ if $[K : F] = 6$. In this case, the degree of the field extension tells us a lot. (Unfortunately, for higher degrees, it's about as hard to find the order of the group $|G|$ as it is to find $[K : F]$.)

In particular, we found that $G = A_3$ if and only if the discriminant of $f$ is a square in $F$: this gives us a good answer to the first question above.
Question 273. What do we know about G in general, though? 


It's true that $G$ permutes our roots $\alpha_1, \dots, \alpha_n$, and the operation of $G$ on the roots is faithful: fixing the roots fixes the entire field $K$. This means we have an injective homomorphism from $G$ to $S_n$, and therefore $G$ is isomorphic to a subgroup of $S_n$. But we actually know a little bit more:

Proposition 274

The operation of $G$ on $\{\alpha_1, \dots, \alpha_n\}$ is transitive (they form an orbit). In other words, for all $1 \le i \le n$, there exists a $\sigma \in G$ such that $\sigma(\alpha_1) = \alpha_i$.


Proof. Let $\{\alpha_1, \dots, \alpha_k\}$ be the orbit of $\alpha_1$, and let $h(x) = (x - \alpha_1) \cdots (x - \alpha_k)$. This is a factor of $f(x)$, and the coefficients of $h$ are symmetric functions of $\alpha_1, \dots, \alpha_k$. But since they form an orbit, every $\sigma$ in $G$ permutes $\alpha_1, \dots, \alpha_k$, which means it fixes the symmetric functions.

But the fixed field of $G$ is $F$, and therefore the coefficients of $h(x)$ are forced to be in $F$! Now since $f(x)$ is irreducible, we must have $f = h$, meaning the orbit of $\alpha_1$ is indeed $\{\alpha_1, \dots, \alpha_n\}$.

So $G$ is isomorphic to a transitive subgroup of $S_n$, and that's how we'll identify our subgroups. One important point: every element of $S_n$ permutes the roots, but not all permutations are automorphisms of the field.


Example 275 


Now let's do the case where the degree of f is 4. 


The Galois group is now a transitive subgroup of $S_4$, which operates on $\{\alpha_1, \dots, \alpha_4\}$. Since the orbit has order 4, $|G|$ must be a multiple of 4 and divide 24: the options are 4, 8, 12, and 24.

What is the group in each case? $G = S_4$ for order 24 and $A_4$ for order 12 (these are the only subgroups of those sizes). The group of order 8 is one of the $D_4$s, and the group of order 4 can either be $C_4$ or $D_2$. (Remember that $[K : F] = |G|$ in all of these cases.)

Well, we have our discriminant
$$D = (\alpha_1 - \alpha_2)^2 (\alpha_1 - \alpha_3)^2 (\alpha_1 - \alpha_4)^2 (\alpha_2 - \alpha_3)^2 (\alpha_2 - \alpha_4)^2 (\alpha_3 - \alpha_4)^2.$$
This has degree 12 in the $\alpha_i$s, and we can think about permuting the roots. Switching any two of the roots keeps $D$ constant, but what does it do to
$$\sqrt{D} = \delta = (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3) \cdots (\alpha_3 - \alpha_4)?$$
It turns out that odd permutations take $\delta \mapsto -\delta$, and even permutations take $\delta \mapsto \delta$, just like in the cubic case. So $\delta$ is fixed only by even permutations, and therefore, $D$ is a square in our field $F$ if and only if $G$ only contains even permutations.

Thus, $D$ is a square if and only if $G \subset A_4$, which means we have one of the groups $A_4$ and $D_2$.
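The claim about how permutations act on $\delta$ can be checked by brute force over all 24 elements of $S_4$. The sample values below are arbitrary distinct integers standing in for the roots, and the helper names are made up for this sketch.

```python
from itertools import combinations, permutations
from math import prod

def delta(values):
    """Product of (a_i - a_j) over all pairs i < j."""
    return prod(values[i] - values[j] for i, j in combinations(range(len(values)), 2))

def sign(perm):
    """Sign of a permutation of range(n), computed by counting inversions."""
    inversions = sum(1 for i, j in combinations(range(len(perm)), 2) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

alphas = (1, 2, 4, 8)  # arbitrary distinct sample values standing in for the roots
base = delta(alphas)
ok = all(
    delta(tuple(alphas[p] for p in perm)) == sign(perm) * base
    for perm in permutations(range(4))
)
print(ok)  # True
```

Every permutation multiplies $\delta$ by its sign, so the subgroup of $S_4$ fixing $\delta$ is exactly $A_4$.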


Fact 276 


Lagrange wrote a long paper on quartic equations, and the only thing that people remember from it is to figure 


out whether the Galois group has an element of order 3. 


Consider the three elements
$$\beta_1 = \alpha_1\alpha_2 + \alpha_3\alpha_4, \qquad \beta_2 = \alpha_1\alpha_3 + \alpha_2\alpha_4, \qquad \beta_3 = \alpha_1\alpha_4 + \alpha_2\alpha_3.$$
Under $S_4$, $\{\beta_1, \beta_2, \beta_3\}$ is the orbit of $\beta_1$. Now
$$g(x) = (x - \beta_1)(x - \beta_2)(x - \beta_3)$$
has symmetric coefficients in the $\beta_j$s, so it's also symmetric in the $\alpha_i$s. For example, $\beta_1 + \beta_2 + \beta_3$ is the sum of pairwise products $\alpha_i\alpha_j$, which is just the $x^2$-coefficient of $f(x)$ (the second symmetric function).


So now in the special case where $f(x) = x^4 - px + q$ (so $p = s_3(\alpha)$ and $q = s_4(\alpha)$), we have that
$$g(x) = x^3 - s_1(\beta)x^2 + s_2(\beta)x - s_3(\beta).$$
$s_1(\beta)$ has degree 2 in the $\alpha_i$s, $s_2(\beta)$ has degree 4, and $s_3(\beta)$ has degree 6. Thus $s_1(\beta)$, which has degree 2, needs to be a combination of $s_1(\alpha)^2$ and $s_2(\alpha)$: both are zero in this case, so it's zero! $s_2(\beta)$ can only be written as a combination of $s_3$ and $s_4$ with degrees 3 and 4 respectively, so $s_2(\beta) = aq$, and similarly $s_3(\beta) = bp^2$ for some constants $a, b$.

Much like in the problem set, we can now compute two particular polynomials to find the values of $a$ and $b$: taking $f(x) = x^4 - 1$, we have roots $1, i, -1, -i$, and now $\beta_1 = 2i$, $\beta_2 = 0$, $\beta_3 = -2i$. This gives $s_2(\beta) = 4$ with $q = -1$, so $a = -4$. Similarly, we can take $f(x) = x^4 - x$ with roots $0, 1, \omega, \bar\omega$, where $\beta_1\beta_2\beta_3 = 1$ and $p = 1$, and we find that $b = 1$.

So when $f(x) = x^4 - px + q$, we have that $g(x) = x^3 - 4qx - p^2$. Now we can ask whether $g(x)$ is irreducible over our field $F$: if it is, then $F \subset F(\beta_1) \subset K$ with $[F(\beta_1) : F] = 3$, so 3 divides $[K : F] = |G|$, and therefore $G$ is either $A_4$ or $S_4$. On the other hand, if $g$ is reducible, then 3 doesn't divide the order of the group, which gives us one of the other cases.


There's just one bug in this: what if two of the $\beta_j$s are equal? Then we lose control, since we need to make sure the group acts on three elements! Luckily, the $\beta_j$s are distinct, because the discriminant can be written as
$$D(g) = (\beta_1 - \beta_2)^2 (\beta_1 - \beta_3)^2 (\beta_2 - \beta_3)^2$$
and now we have a miracle: $\beta_1 - \beta_2 = (\alpha_1\alpha_2 + \alpha_3\alpha_4) - (\alpha_1\alpha_3 + \alpha_2\alpha_4) = (\alpha_1 - \alpha_4)(\alpha_2 - \alpha_3)$, and similarly for the other two differences, which means that $D(g) = D(f) \neq 0$. Therefore our roots are indeed distinct!


So now we can make a table:

| | $g$ is irreducible | $g$ is reducible |
|---|---|---|
| $D$ square | $A_4$ | $D_2$ |
| $D$ not a square | $S_4$ | $D_4$ or $C_4$ |

It's not that easy to resolve that last ambiguity in general, but in general we should expect $G$ to be $S_4$.


Example 277 


What's the Galois group of the splitting field of $x^4 + x + 1$ over $\mathbb{Q}$?

We have $g(x) = x^3 - 4x - 1$, and both $f$ and $g$ here are indeed irreducible. They have the same discriminant
$$D(g) = -4(-4)^3 - 27(-1)^2 = 256 - 27 = 229,$$
which is not a square. That means that the Galois group $G$ is the symmetric group $S_4$.
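The table lookup can be sketched in code for quartics of the form $x^4 - px + q$ with integer coefficients over $\mathbb{Q}$. This sketch takes the irreducibility of $f$ itself as an assumption (it is not checked), tests $g$ with the rational root test, and uses invented function names.

```python
from math import isqrt

def is_square(d):
    """True if the integer d is a perfect square."""
    return d >= 0 and isqrt(d) ** 2 == d

def resolvent_cubic(p, q):
    """Coefficients (c, d) of g(x) = x^3 + c x + d for f(x) = x^4 - p x + q."""
    return (-4 * q, -p * p)

def has_integer_root(c, d):
    """Rational root test for the monic integer cubic x^3 + c x + d."""
    if d == 0:
        return True
    return any(
        r**3 + c * r + d == 0
        for k in range(1, abs(d) + 1) if d % k == 0
        for r in (k, -k)
    )

def quartic_galois_group(p, q):
    """Table lookup for f(x) = x^4 - p x + q, ASSUMING f is irreducible over Q."""
    c, d = resolvent_cubic(p, q)
    disc = -4 * c**3 - 27 * d**2  # D(g), which equals D(f)
    g_irreducible = not has_integer_root(c, d)
    if is_square(disc):
        return "A4" if g_irreducible else "D2"
    return "S4" if g_irreducible else "D4 or C4"

print(quartic_galois_group(-1, 1))  # x^4 + x + 1 gives S4
```

For $x^4 + x + 1$ we have $p = -1$ and $q = 1$, so the sketch recovers $g(x) = x^3 - 4x - 1$ and the discriminant 229 from the example.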


Example 278 


What if we have the irreducible polynomial $f(x) = x^4 + x^3 + x^2 + x + 1$ over $\mathbb{Q}$?


We have the roots $\zeta, \zeta^2, \zeta^3, \zeta^4$, where $\zeta = e^{2\pi i/5}$ is a fifth root of unity. Then $K = F(\zeta)$ has degree 4 over $\mathbb{Q}$, and any automorphism is determined by where $\zeta$ goes (since the other roots are just powers of it)! For example, the automorphism $\zeta \to \zeta^2$ sends

$$\zeta \to \zeta^2 \to \zeta^4 \to \zeta^3 \to \zeta.$$

That means $\sigma = (1243)$ generates the Galois group, and thus the group is cyclic of order 4.


Example 279


What if we want the splitting field of $(x^2 - 2)(x^2 - 3)$ over $\mathbb{Q}$?


Then $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$, and now $\gamma = \alpha + \beta$ is a primitive element: it's a root of $x^4 - 10x^2 + 1$, which is irreducible. But now $\gamma^2 = 5 + 2\sqrt{6}$, and we can similarly find that $G$ is $D_2$ using the chart.
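
The two claims about $\gamma$ can be checked with exact integer arithmetic. The sketch below stores an element $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ as a 4-tuple. That representation and the helper name `mul` are choices made for this illustration, not something from the lecture.

```python
def mul(u, v):
    """Multiply a + b*sqrt2 + c*sqrt3 + d*sqrt6 elements given as 4-tuples."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,   # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),         # sqrt3*sqrt6 = 3*sqrt2
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),         # sqrt2*sqrt6 = 2*sqrt3
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,               # sqrt2*sqrt3 = sqrt6
    )

one = (1, 0, 0, 0)
gamma = (0, 1, 1, 0)                      # sqrt2 + sqrt3
gamma2 = mul(gamma, gamma)
gamma4 = mul(gamma2, gamma2)

# gamma^4 - 10*gamma^2 + 1, computed coordinate by coordinate
value = tuple(g4 - 10 * g2 + e for g4, g2, e in zip(gamma4, gamma2, one))
print(gamma2, value)
```

Here `gamma2` is the tuple for $5 + 2\sqrt{6}$, and `value` being the zero tuple confirms that $\gamma$ is a root of $x^4 - 10x^2 + 1$.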
We'll have a quiz on Friday! 


35 May 13, 2019 


(We got more cookies in class today.) Today's topic is adjoining roots of unity of the form $e^{2\pi i/p}$ where $p$ is a prime.

First of all, let's review the Main Theorem: if $K$ is a Galois extension of $F$ with Galois group $G$, the intermediate fields between $F$ and $K$ correspond exactly to subgroups of $G$. Specifically, the correspondence is
$$F \subset L = K^H \subset K$$
$$G \supset H = G(K/L) \supset \{1\},$$
where $K/L$ is a Galois extension with Galois group $H$ of order $|H| = [K : L]$, and therefore $[L : F] = [G : H]$.


Proposition 280 


L is a Galois extension of F if and only if H is a normal subgroup of G. In this case, we have G(L/F) = G/H. 


Example 281 
If $K$ is the splitting field of an irreducible cubic $f$, and the Galois group is $S_3$, then $S_3$ contains $A_3$ and three cyclic groups of order 2 (generated by transpositions).


Then $K^{A_3}$ is a degree 2 extension over $F$: all such extensions are Galois, since all subgroups of index 2 are normal. But the other extensions are not Galois, because the transpositions do not generate normal subgroups of $S_3$.


Example 282 


Now let's take $F = \mathbb{Q}$ and $K = F(\zeta_p)$.


We know that $\zeta_p$ is a root of $x^{p-1} + \cdots + x + 1$, which is irreducible over $F$ (as derived earlier in class). We have the roots $\zeta, \zeta^2, \cdots, \zeta^{p-1}$, and $[K : F] = p - 1$.

But all of those roots "look the same" from the perspective of $\mathbb{Q}$: specifically, there is a unique automorphism sending $\zeta$ to $\zeta^s$ for each $1 \le s \le p - 1$. So those describe all of the automorphisms: we should interpret the exponent $s$ in $\zeta^s$ as being an element of $\mathbb{F}_p^\times$. After all, if $\sigma(\zeta) = \zeta^s$, $\tau(\zeta) = \zeta^t$, then
$$\sigma\tau(\zeta) = \sigma(\tau(\zeta)) = \sigma(\zeta^t) = (\sigma(\zeta))^t = \zeta^{st}.$$
So composition of automorphisms corresponds to multiplying the exponents (and reducing mod $p$). Therefore, the Galois group is of the form $G \cong \mathbb{F}_p^\times$, which is a cyclic group of order $p - 1$.


Example 283


Let $p = 13$: 2 is a primitive root in $\mathbb{F}_{13}$.


Then the powers of 2 are
$$1, 2, 4, 8, 3, 6, 12, 11, 9, 5, 10, 7.$$
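
The list of powers is easy to reproduce. The following sketch (helper names are made up for this note) lists the powers of a candidate generator mod $p$ and searches for the smallest primitive root by brute force, which is fine for the small primes used in these examples.

```python
def powers(g, p):
    """List g^0, g^1, ..., g^(p-2) reduced mod p."""
    out, x = [], 1
    for _ in range(p - 1):
        out.append(x)
        x = x * g % p
    return out

def is_primitive_root(g, p):
    """g is a primitive root mod p when its powers hit every nonzero residue."""
    return len(set(powers(g, p))) == p - 1

def smallest_primitive_root(p):
    """Brute-force search, assuming p is an odd prime."""
    return next(g for g in range(2, p) if is_primitive_root(g, p))

print(powers(2, 13))
print(smallest_primitive_root(13))
```

The position of a residue in `powers(2, 13)` is the power of $\sigma$ that sends $\zeta$ to that power of $\zeta$, which is what makes the subgroup computations below mechanical.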


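So the Galois group is a group of order 12 generated by $\sigma$, which sends $\sigma(\zeta) = \zeta^2$ (then $\zeta^2$ gets sent to $\zeta^4$, and so on). Every subgroup of a cyclic group is cyclic, and we need the order to divide 12. To draw out our diagram of fields, note that the subgroups are nested via
$$\langle\sigma\rangle \supset \langle\sigma^2\rangle \supset \langle\sigma^4\rangle \supset \{1\}$$
and
$$\langle\sigma\rangle \supset \langle\sigma^3\rangle \supset \langle\sigma^6\rangle \supset \{1\},$$
with the additional restriction that $\langle\sigma^6\rangle \subset \langle\sigma^2\rangle$. Let's do some example computations!

First of all, how can we find the intermediate field corresponding to $\sigma^2$? It sends $\zeta$ to $\zeta^4$, so we care about the orbits
$$[1, 4, 3, 12, 9, 10][2, 8, 6, 11, 5, 7].$$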
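
These orbits are just the cycles of multiplication by a fixed residue on the exponents $1, \ldots, 12$. A minimal sketch, with a hypothetical helper `orbits`, assuming (as above) that $\sigma$ doubles exponents mod 13:

```python
def orbits(p, multiplier):
    """Cycles of k -> multiplier * k (mod p) on the exponents 1..p-1."""
    seen, result = set(), []
    for start in range(1, p):
        if start in seen:
            continue
        orbit, k = [], start
        while k not in seen:
            seen.add(k)
            orbit.append(k)
            k = k * multiplier % p
        result.append(orbit)
    return result

sigma = 2                                  # sigma sends zeta to zeta^2
print(orbits(13, pow(sigma, 2, 13)))       # orbits of sigma^2, i.e. k -> 4k
```

The same call with `pow(sigma, 3, 13)` or `pow(sigma, 6, 13)` gives the orbits for the other subgroups of the cyclic group of order 12.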


Let $\alpha_1 = \zeta + \zeta^4 + \zeta^3 + \zeta^{12} + \zeta^9 + \zeta^{10}$, and let $\alpha_2$ be the sum of the other roots of unity. Then $\sigma$ sends $\alpha_1$ to $\alpha_2$ and vice versa, so let's calculate
$$(x - \alpha_1)(x - \alpha_2).$$
Then $\alpha_1 + \alpha_2 = -1$ (it's all the roots of unity except 1), and $\alpha_1\alpha_2$ is the sum of 36 different terms. None of the exponents is zero mod 13 (because $k$ and $13 - k$ always appear both in $\alpha_1$ or both in $\alpha_2$), and now $\sigma$ fixes the product $\alpha_1\alpha_2$, which is therefore in $F$.


Lemma 284 


The only rational linear combinations of the roots of unity satisfying
$$\sum_{i=0}^{p-1} c_i \zeta^i = 0$$
are those with $c_0 = c_1 = \cdots = c_{p-1}$.


In other words, a sum of the nontrivial roots of unity can only be rational if they all have equal contribution! This means we get 3 copies of all 12 roots, which means the total sum is $-3$. That means $\alpha_1, \alpha_2$ are roots of the polynomial
$$x^2 + x - 3,$$
and thus $\alpha_{1,2} = \frac{-1 \pm \sqrt{13}}{2}$, and our intermediate field is $\mathbb{Q}(\sqrt{13})$.
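
The count of 36 terms can be carried out by machine. The sketch below tallies the exponents $i + j \pmod{13}$ that occur in the product of the two orbit sums, then uses $1 + \zeta + \cdots + \zeta^{12} = 0$ to turn an evenly distributed tally into a rational number. The function names are invented for this illustration.

```python
from collections import Counter

def product_exponents(p, orbit_a, orbit_b):
    """Exponents i + j (mod p) of the terms of (sum of zeta^i)(sum of zeta^j)."""
    return Counter((i + j) % p for i in orbit_a for j in orbit_b)

def rational_value(p, counts):
    """Value of sum(counts[k] * zeta^k) when every nontrivial power of zeta
    appears equally often, using 1 + zeta + ... + zeta^(p-1) = 0."""
    multiplicities = {counts[k] for k in range(1, p)}
    if len(multiplicities) != 1:
        raise ValueError("the sum is not rational")
    return counts[0] - multiplicities.pop()

orbit_1 = [1, 4, 3, 12, 9, 10]
orbit_2 = [2, 8, 6, 11, 5, 7]
counts = product_exponents(13, orbit_1, orbit_2)
product = rational_value(13, counts)
print(counts[0], product)
```

Here `counts[0]` is the number of terms equal to 1, and `product` is the constant term of $(x - \alpha_1)(x - \alpha_2)$.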


Next, how can we find the intermediate field corresponding to $\sigma^3$? We have the orbits
$$[1, 8, 12, 5][2, 3, 11, 10][4, 6, 9, 7].$$
Calling the elements $\beta_1 = \zeta + \zeta^8 + \zeta^{12} + \zeta^5$ and similarly for the other orbits, we know that $\sigma$ permutes $\beta_1, \beta_2, \beta_3$ cyclically. That means the $\beta_i$s are roots of the polynomial
$$(x - \beta_1)(x - \beta_2)(x - \beta_3).$$
We know that $s_1(\beta) = -1$, $s_2(\beta)$ is a sum of $4^2 \cdot 3 = 48$ terms, none of which are 1, so it's $-\frac{48}{12} = -4$, and $s_3(\beta)$ is a sum of $4^3 = 64$ terms, some number of which are equal to 1 (exponent zero mod 13). The remaining terms must make up whole copies of the 12 nontrivial roots, so the number of 1s is 4, 16, and so on. It's unlikely there's 16 or more, so there are 4 (technically), and now $s_3(\beta) = -\frac{60}{12} + 4 = -1$. That means the $\beta_i$s are roots of
$$x^3 + x^2 - 4x + 1.$$
We know that the Galois group of this cubic, $G/\langle\sigma^3\rangle$, has order 3, so we can plug this polynomial into the formula for the discriminant: it'll turn out to be a square.
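
Both the "4 zeros" count and the square discriminant can be checked directly. This is a sketch under the same assumption the text makes, namely that the terms not equal to 1 are spread evenly over the 12 nontrivial roots. The discriminant formula used is the standard one for a monic cubic $x^3 + bx^2 + cx + d$.

```python
from math import isqrt

def cubic_discriminant(b, c, d):
    """Discriminant of x^3 + b*x^2 + c*x + d."""
    return b * b * c * c - 4 * c**3 - 4 * b**3 * d - 27 * d * d + 18 * b * c * d

orbit_sets = [[1, 8, 12, 5], [2, 3, 11, 10], [4, 6, 9, 7]]

# Number of the 64 terms of beta_1 * beta_2 * beta_3 whose exponent is 0 mod 13
ones = sum((i + j + k) % 13 == 0
           for i in orbit_sets[0] for j in orbit_sets[1] for k in orbit_sets[2])

# Each full copy of the 12 nontrivial roots contributes -1
s3 = ones - (64 - ones) // 12

disc = cubic_discriminant(1, -4, 1)
print(ones, s3, disc, isqrt(disc) ** 2 == disc)
```

A square discriminant is what we expect when the Galois group of a cubic is cyclic of order 3.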


Finally, how can we find the intermediate field corresponding to $\sigma^6$? We have orbits
$$[1, 12][2, 11][4, 9][8, 5][3, 10][6, 7].$$
Calling the sums of the powers of $\zeta$ here $\gamma_1, \cdots, \gamma_6$, we know that $\gamma_1 = \zeta + \zeta^{-1} = 2\cos\frac{2\pi}{13}$. We can find the product $(x - \gamma_1) \cdots (x - \gamma_6)$, but that's ugly and sad.


Example 285 


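Instead, let's do an analogous calculation for $p = 7$: 3 is a primitive root there.


This gives us the orbit $[1, 3, 2, 6, 4, 5]$, and now we'll find the intermediate field corresponding to $\sigma^3$, which gives us the orbits
$$[1, 6][3, 4][2, 5].$$
Now it's reasonable to compute $(x - \gamma_1)(x - \gamma_2)(x - \gamma_3)$. Remember that we have the relation
$$(a + b)^3 = (a^3 + b^3) + 3(a^2 b + ab^2),$$
and since $\gamma_1 = \gamma = \zeta + \zeta^{-1}$,
$$\gamma^3 = (\zeta^3 + \zeta^{-3}) + 3(\zeta + \zeta^{-1}),$$
and
$$\gamma^2 = (\zeta^2 + \zeta^{-2}) + 2.$$
Since the six nontrivial roots of unity sum to $-1$, this yields a linear relation between $1, \gamma, \gamma^2, \gamma^3$:
$$\gamma^3 + \gamma^2 - 2\gamma - 1 = 0.$$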
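
A quick floating-point sanity check of this relation, using $\gamma_k = 2\cos\frac{2\pi k}{7}$ for $k = 1, 2, 3$ (the three orbit sums). This only confirms the identity numerically up to rounding and is not a proof.

```python
import math

def cubic(x):
    return x**3 + x**2 - 2 * x - 1

gammas = [2 * math.cos(2 * math.pi * k / 7) for k in (1, 2, 3)]
residuals = [cubic(g) for g in gammas]
print(residuals)

# The coefficients are also recovered from the roots: sum, pairwise products, product
s1 = sum(gammas)
s2 = gammas[0] * gammas[1] + gammas[0] * gammas[2] + gammas[1] * gammas[2]
s3 = gammas[0] * gammas[1] * gammas[2]
print(s1, s2, s3)
```

The symmetric functions come out close to $-1$, $-2$, and $1$, matching $x^3 - s_1 x^2 + s_2 x - s_3 = x^3 + x^2 - 2x - 1$.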


Example 286 


To finish, let's do $p = 17$: notably, $p - 1$ is a power of 2, so we can construct a regular 17-gon. We have a primitive root 3 here.


$\sigma$ permutes the roots via
$$[1, 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6].$$

$\sigma^2$ splits this into two orbits:
$$[1, 9, 13, 15, 16, 8, 4, 2][3, 10, 5, 11, 14, 7, 12, 6].$$

Denoting the sums $\alpha_1$ and $\alpha_2$, we have $\alpha_1 + \alpha_2 = -1$, and $\alpha_1\alpha_2$ is the sum of 64 terms that are all not 1, and therefore it's $-\frac{64}{16} = -4$. So our polynomial is
$$x^2 + x - 4 \implies \alpha = \frac{-1 \pm \sqrt{17}}{2},$$
so our intermediate field is $\mathbb{Q}(\sqrt{17})$.


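Next, if we want the field corresponding to $\sigma^4$, we take
$$[1, 13, 16, 4][9, 15, 8, 2][3, 5, 14, 12][10, 11, 7, 6].$$
Note that $\beta_1 + \beta_2 = \alpha_1$ here, and $\beta_1\beta_2$ has 16 terms with no zeros that turn out to sum to $-1$. That means $\beta_1, \beta_2$ are roots of $x^2 - \alpha_1 x - 1$: in other words, $[F(\beta) : F(\alpha)] = 2$, and now we can just compute using the quadratic formula (with $\alpha_1^2 + 4 = 8 - \alpha_1$) to find that
$$\beta_1 = \frac{1}{2}\left(\frac{-1 + \sqrt{17}}{2} + \sqrt{\frac{17 - \sqrt{17}}{2}}\right).$$
Finally, what about the field corresponding to $\sigma^8$, which contains $\cos\frac{2\pi}{17}$? Similarly, if $\gamma_1 = \zeta + \zeta^{-1}$ and $\gamma_2 = \zeta^{13} + \zeta^4$, then $\gamma_1 + \gamma_2 = \beta_1$ and $\gamma_1\gamma_2 = [14, 5, 12, 3] = \beta_3$. So $\gamma_1, \gamma_2$ are roots of $x^2 - \beta_1 x + \beta_3$, and that means we can write out (in nested square root form) the value of $\gamma$ as well.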
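
The tower of quadratics can be followed numerically to recover $\cos\frac{2\pi}{17}$ using nothing but square roots. In this sketch the choice of sign at each step (taking the larger root every time) is an assumption made here and confirmed only by comparing with `math.cos`. The variable names follow the orbit sums above.

```python
import math

s17 = math.sqrt(17)
alpha1 = (-1 + s17) / 2        # sum over the orbit containing 1
alpha2 = (-1 - s17) / 2        # sum over the other orbit

# beta1, beta2 are the roots of x^2 - alpha1*x - 1; take the positive one
beta1 = (alpha1 + math.sqrt(alpha1**2 + 4)) / 2
# beta3, beta4 are the roots of x^2 - alpha2*x - 1; take the positive one
beta3 = (alpha2 + math.sqrt(alpha2**2 + 4)) / 2

# gamma1, gamma2 are the roots of x^2 - beta1*x + beta3; take the larger one
gamma1 = (beta1 + math.sqrt(beta1**2 - 4 * beta3)) / 2

print(gamma1 / 2, math.cos(2 * math.pi / 17))
```

Each line uses one new square root, which mirrors the chain of degree 2 extensions and is why the regular 17-gon is constructible.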


We'll talk about the quintic equation next time! 


36 May 15, 2019 


Today, we're going to not solve the quintic equation. 


Proposition 287 (Cardano’s formula) 


If we have a cubic $f(x) = x^3 + 3px + 2q$, then there is a root
$$\alpha = \sqrt[3]{-q + \sqrt{q^2 + p^3}} + \sqrt[3]{-q - \sqrt{q^2 + p^3}}.$$
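
As a small numerical illustration of the formula, here is a sketch restricted to the case $q^2 + p^3 \ge 0$, where both cube roots can be taken real. The helper names are made up, and the three-real-root case (which needs complex cube roots) is deliberately left out.

```python
import math

def real_cbrt(x):
    """Real cube root, valid for negative inputs as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def cardano_root(p, q):
    """A real root of x^3 + 3*p*x + 2*q, assuming q^2 + p^3 >= 0."""
    inner = q * q + p**3
    if inner < 0:
        raise ValueError("three real roots: this sketch needs complex cube roots")
    r = math.sqrt(inner)
    return real_cbrt(-q + r) + real_cbrt(-q - r)

# x^3 + 6x - 20 has p = 2, q = -10
root = cardano_root(2, -10)
print(root, root**3 + 6 * root - 20)
```

For this cubic the two cube roots are $1 + \sqrt{3}$ and $1 - \sqrt{3}$, so the nested radicals collapse to the integer root 2, something the formula itself does not make obvious.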


This is useless for our purposes (and also in general) though. Let's try to do a study of this problem that's not so stupid:


Definition 288 


A polynomial $f(x)$ is solvable in terms of field extensions if there exists a chain
$$F = F_0 \subset F_1 \subset \cdots \subset F_k$$
such that $F_{i+1} = F_i(\sqrt[p_i]{a_i})$ for some prime $p_i$ and some $a_i \in F_i$, and $f$ has a root in the last extension $F_k$.


This looks a little bit more general! Note that each extension can be restricted to having a prime $p$ in the index of the root, because $\sqrt[mn]{a} = \sqrt[m]{\sqrt[n]{a}}$.


Proposition 289 


Suppose we have an element $a \in F$, and $\alpha^p = a$ for some prime $p$. If $\zeta_p = e^{2\pi i/p}$ is in $F$, then $K = F[\alpha]$ is the splitting field of $x^p - a$ over $F$: the roots are just $\alpha, \zeta\alpha, \zeta^2\alpha$, and so on.


(Note that a priori we don't necessarily know that the polynomial is even irreducible.)


Proof. If $G$ is our Galois group, then any $\sigma \in G$ needs to send a root to another root, so $\sigma\alpha = \zeta^s\alpha$ for some $s$. If $\tau \in G$ as well, and $\tau\alpha = \zeta^t\alpha$, then
$$\sigma\tau(\alpha) = \sigma(\zeta^t\alpha) = \zeta^t\sigma(\alpha) = \zeta^{s+t}\alpha$$
since we assume $\zeta \in F$. This means that $G$ must be isomorphic to a subgroup of the additive group $\mathbb{F}_p^+$: since $p$ is prime, the only possibilities are the identity or the whole group.


This means that if $\alpha$ is not in $F$, the extension isn't trivial, so $G$ is cyclic of order $p$. Therefore, if $\zeta_p \in F$, $a \in F$, then $x^p - a$ is either irreducible or it splits completely in $F[x]$.


($\zeta$ needs to be in $F$: for example, consider $x^3 - 8$ over $\mathbb{Q}$ as a counterexample otherwise, since $x^3 - 8 = (x - 2)(x^2 + 2x + 4)$ is neither irreducible nor completely split.)

And if we want $\zeta_p$ (to satisfy the assumptions above), we can just adjoin it to our field. Let $F = \mathbb{Q}$, and let $K = F(\zeta_p)$ for some prime $p$. Then $K$ is a Galois extension with Galois group $\mathbb{F}_p^\times$, which is cyclic. Thus, the idea is that we can always reach our final field $K$ (for a solvable polynomial) with a sequence $F = F_0 \subset F_1 \subset \cdots \subset F_k = K$, where $F_{i+1}$ is a Galois extension of $F_i$ with (cyclic) Galois group of prime order.


Example 290 


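We can get $\mathbb{Q}(\zeta_5)$ with the following method:


We want to use two quadratic extensions to get a Galois group of order $5 - 1 = 4$. It turns out we (probably) use
$$F_0 = \mathbb{Q} \subset \mathbb{Q}(\sqrt{5}) \subset \mathbb{Q}\left(\sqrt{-2(5 + \sqrt{5})}\right),$$
since $\zeta_5 = \frac{-1 + \sqrt{5}}{4} + \frac{1}{4}\sqrt{-2(5 + \sqrt{5})}$.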
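
One way to test a proposed tower like this is numerical. The sketch below uses the standard values $\cos 72^\circ = \frac{-1 + \sqrt{5}}{4}$ and $\sin 72^\circ = \frac{1}{4}\sqrt{10 + 2\sqrt{5}}$, so the second square root is taken of $-2(5 + \sqrt{5})$. That specific radicand is an assumption of this sketch and is checked only by comparing with `cmath.exp`.

```python
import cmath
import math

sqrt5 = math.sqrt(5)

# First quadratic step: adjoin sqrt(5).
# Second quadratic step: adjoin a square root of -2*(5 + sqrt(5)), which is purely imaginary.
theta = cmath.sqrt(-2 * (5 + sqrt5))

zeta = (-1 + sqrt5) / 4 + theta / 4

print(abs(zeta**5 - 1))
print(abs(zeta - cmath.exp(2j * math.pi / 5)))
```

Both printed values should be at the level of rounding error, which shows that $\zeta_5$ is reached after two square roots.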


This means that whenever we want to figure out whether we can adjoin roots, we can just start by putting in all of the roots of unity in our field $F$ that we need. In other words, if $f$ is solvable, then there exists a chain of fields
$$F = F_0 \subset F_1 \subset \cdots \subset F_k$$
where $F_{i+1}$ is a Galois extension of $F_i$ of prime degree: $G(F_{i+1}/F_i) = C_{p_i}$ for some prime $p_i$, and $f$ has a root in $F_k$.


(The converse is true, but let's not worry about that.)


Example 291 


Let's prove that it’s possible to solve quartics (at least theoretically). 


Start with $F_0 = F$, and adjoin a cube root of 1 (so $F_1 = F(\zeta_3)$). Then $F_2 = F_1(\sqrt{D})$ adjoins the square root of the discriminant, and $F_3 = F_2(\beta_1)$, where $\beta_1, \beta_2, \beta_3$ are the roots of the resolvent cubic that we discussed earlier: $\beta_1 = \alpha_1\alpha_2 + \alpha_3\alpha_4$ and so on. Now if $K$ is the splitting field of $f$ with $\zeta_3$ adjoined, the Galois group of $K$ over $F_2$ is contained in the alternating group (because $\sqrt{D}$ is now fixed). Then, when we adjoin $\beta_1$, we can no longer have an element of order 3 for future extensions (because our $\beta_i$s are fixed), so $G(K/F_3) \subset D_2$. That means we can get to $K$ with at most 2 more square roots!


Theorem 292 
Let $f$ be a polynomial of degree 5 in $F[x]$, and let $K$ be the splitting field of $f$. If $G = G(K/F)$ is either $S_5$ or $A_5$, then $f$ is not solvable.


Proof. We can assume $G = A_5$, because otherwise we can just adjoin the square root of the discriminant. Suppose for the sake of contradiction that such a chain does exist:
$$F = F_0 \subset F_1 \subset \cdots \subset F_k.$$
Let $K = K_0$ be the splitting field of $f$ over $F_0$, let $K_1$ be the splitting field of $f$ over $F_1$, and so on. Then we have
$$K_0 \subset K_1 \subset \cdots \subset K_k.$$
Our goal is to show that the Galois group is $G(K_k/F_k) = G(K/F) = A_5$. If we show that, then the polynomial can't have a root in $F_k$, because if it did, the Galois group $G(K_k/F_k)$ would fix that root and so would not contain a 5-cycle! That would give us a contradiction, because $f$ is supposed to have a root in $F_k$.

We do this by induction: it's enough to show that $G(K'/F') = G(K/F)$ for one level of the chain. Note that $F'$ is an extension of $F$ with Galois group $C_p$ for some prime $p$, and then we have the Galois groups $G = G(K/F) = A_5$ and $G' = G(K'/F')$. Since $F'$ is a Galois extension of $F$ as well, it's the splitting field of some polynomial $g$ over $F$. Then $K'$ is the splitting field of $f$ over $F'$, but it's also the splitting field of $g$ over $K$ (because we basically adjoin everything from $f$ and also from $g$). That means that $K'$ is the splitting field of $fg$ over $F$, and therefore $K'/F$ is Galois with some Galois group $\mathcal{G}$.


But by the Main Theorem, $K$ is an intermediate field in the chain
$$F \subset K \subset K',$$
so $K = K'^N$, where $N$ is some normal subgroup of $\mathcal{G}$ (importantly, it's normal because $K/F$ is Galois). This means $G$ is isomorphic to $\mathcal{G}/N$. On the other hand, we also have
$$F \subset F' \subset K',$$
so $F' = K'^{G'}$, where $G'$ is some normal subgroup of $\mathcal{G}$. That means $C_p$ is isomorphic to $\mathcal{G}/G'$.

So now we can map $\varphi: \mathcal{G} \to G \times C_p$ by taking the residue in each quotient map. The kernel of this map, $\ker \varphi$, consists of all $\sigma \in \mathcal{G}$ that operate trivially on both $K$ and $F'$, but $K$ and $F'$ generate $K'$! Therefore, the kernel is trivial and $\varphi$ is injective, and now the order of the group $\mathcal{G}$ must be either $|G|$ or $p|G|$ (it is a multiple of $|G|$ that divides $|G \times C_p| = p|G|$).

If we're in the case where $|\mathcal{G}| = |G|$, then the quotient map $\mathcal{G} \to G$ is bijective, and $\mathcal{G} \cong A_5$ (which is simple). But then it can't map surjectively to $C_p$, because that would imply $C_p$ is a quotient of $A_5$, which is a simple group! Therefore $|\mathcal{G}| = p|G|$, and we have a bijective map $\varphi: \mathcal{G} \to G \times C_p$. Therefore the map from $\mathcal{G}$ to $C_p$ is just the projection $G \times C_p \to C_p$, which has kernel $G'$. But the kernel of $G \times C_p \to C_p$ is $G$, so $G$ is isomorphic to $G'$. Thus $G' = A_5$, and that means that we've made no progress towards solving the quintic.


Galois then wanted to write down a polynomial whose Galois group was actually $A_5$ or $S_5$. He proved the following lemma somehow:


Lemma 293 


Let $G$ be a subgroup of $S_5$ (actually true for $S_p$ with $p$ prime) that contains a 5-cycle and a transposition. Then $G = S_5$.


Proof. This can be basically directly verified: just take various powers of the 5-cycle composed with the transposition. We can find a 3-cycle, a 4-cycle, and a 5-cycle, so the order of the group is divisible by $3 \cdot 4 \cdot 5 = 60$. That leaves only $A_5$ and $S_5$, and it's not $A_5$ (because there exists an odd permutation), so it is $S_5$.
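
The "directly verified" part can also be done by brute force. This sketch closes a set of permutations of $\{0, \ldots, 4\}$ under multiplication by the generators. Permutations are stored as tuples, and the function names are chosen for this illustration only.

```python
def compose(s, t):
    """Return the permutation i -> s[t[i]] (apply t first, then s)."""
    return tuple(s[t[i]] for i in range(len(t)))

def generated_group(generators):
    """All products of the generators. In a finite group this closure
    automatically contains inverses, so it is the generated subgroup."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

five_cycle = (1, 2, 3, 4, 0)        # 0 -> 1 -> 2 -> 3 -> 4 -> 0
transposition = (1, 0, 2, 3, 4)     # swaps 0 and 1
print(len(generated_group([five_cycle, transposition])))
```

Primality matters here: running the same closure on the 4-cycle `(1, 2, 3, 0)` and the transposition `(2, 1, 0, 3)` produces only a dihedral group of order 8, not all of $S_4$.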


To finish the class, let's write down a quintic that is not solvable. Take $F = \mathbb{Q}$: note that $x^5 - 16x = x(x^2 + 4)(x^2 - 4)$ has three real roots. If we add a little constant, like $x^5 - 16x + 2$, we get an irreducible polynomial by Eisenstein. We still have 3 real roots (because we haven't perturbed our function too much).

The Galois group operates transitively on 5 roots, so its order is divisible by 5 and it contains a 5-cycle. Now $F \subset F(\alpha_1, \alpha_2, \alpha_3) \subset K$, where $\alpha_1, \alpha_2, \alpha_3$ are the real roots of $f$. Since $K/F(\alpha_1, \alpha_2, \alpha_3)$ has degree 2, there must be an element of the Galois group that switches the two complex roots. So we have a 5-cycle and a transposition, and thus the Galois group is $S_5$!
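
Both inputs to this argument are easy to confirm for $x^5 - 16x + 2$. The sketch below checks Eisenstein's criterion at the prime 2 and counts sign changes of $f$ at a few integer sample points. The sample points are chosen by hand for this illustration. Three sign changes give at least three real roots, and since $f'(x) = 5x^4 - 16$ has only two real zeros, there cannot be more than three.

```python
def is_eisenstein(coeffs, prime):
    """Eisenstein's criterion for integer coefficients [a_n, ..., a_1, a_0]."""
    lead, *middle, const = coeffs
    return (lead % prime != 0
            and all(c % prime == 0 for c in middle)
            and const % prime == 0
            and const % (prime * prime) != 0)

def f(x):
    return x**5 - 16 * x + 2

coeffs = [1, 0, 0, 0, -16, 2]
samples = [-3, -1, 1, 3]
signs = [f(x) > 0 for x in samples]
sign_changes = sum(a != b for a, b in zip(signs, signs[1:]))

print(is_eisenstein(coeffs, 2), [f(x) for x in samples], sign_changes)
```

With exactly three real roots, the remaining two roots form a complex conjugate pair, and complex conjugation is the transposition used in the argument above.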